ABSTRACT


This  report  presents  several  significant  results  on 

deterministic  processor  scheduling.  For  some  minimal  length 

problems,  polynomial  algorithms  are  given;  namely,  an  0(nt>) 

algorithm for two special types of three-processor flow
shops and an O(n^2) algorithm for an m-processor bound system
with equal execution time tasks on two task chains. Using a
dynamic programming approach, an O(n^I) algorithm is also
outlined for a tree-structured set of equal execution time
tasks on a 2-processor bound system, where I is the number
of terminal subsets of the tree. This algorithm is not
polynomial in n but is a significant improvement over the
alternative of simple enumeration.

Several problems are also shown to be NP-complete. These
are minimizing schedule length on two-processor bound unit
execution time task systems even when the precedence
constraints consist of chains only, 2-maximal three-processor
flow shops and 1 or 3 maximal/minimal flow shops, and
minimizing the mean flow time on the two-processor open
shop. With the exception of the 2-maximal flow shop, the
results are strong NP-completeness results obtained by
reduction from the 3-PARTITION problem.

Finally, in the area of performance bounds, tight bounds
are obtained on the lengths of list schedules on identical
processors for independent tasks with similar execution
times, and on the mean flow times of arbitrary and SPT
schedules for the open shop.
Chapter One


INTRODUCTION 


The scheduling problem is concerned with the allocation
of resources over time to perform a collection of tasks. It
is deterministic when the information describing the tasks
is assumed to be known in advance; that is, the task system
is static as opposed to dynamic systems in which new tasks
may be added while scheduling is in progress. The problem
may be broken down into that of allocation and sequencing.
In other words, given a set of processors and tasks, a
schedule is fully specified when the tasks to be performed
on each processor are determined (allocation) and the order
in which the tasks on each processor will be done is given
(sequencing) subject to prespecified constraints.
Occasionally, however, only one of these two elements may be
present.


Scheduling may be viewed, firstly, as a decision-making
function and, secondly, as a body of theory. This thesis is
concerned mainly with the task of furthering understanding
of the scheduling problem from the theoretical viewpoint. In
this chapter, the scheduling function and the theoretical
problem are briefly discussed and, finally, an outline of
the remaining chapters is presented.
1.1 The Scheduling Function

As a decision-making function, scheduling is of general
practical value; one hardly need dwell on motivations for
its study. Its applicability ranges from the simple, such as
organizing an effective workday or preparing a meal, to the
most complex, such as planning the operations for a large
computer installation or job-shop environment.

An important element of the scheduling function is that
of evaluating a set of alternative courses of action and
selecting the cheapest depending on stated objectives. In
general, the measurement of the costs in a system due to
scheduling decisions is a difficult task; however, three
measures of performance have become prevalent, namely,
efficient utilization of resources, rapid response to
demands, and close conformance to prescribed deadlines.
These are closely modelled when schedules are designed to
minimize the time taken to finish all the tasks, to minimize
the mean time tasks spend in the system, and to minimize
lateness or tardiness. Thus, the mathematical models with
which scheduling theory is concerned are of value in their
ability to represent the general structure and essential
properties of real life scheduling problems.
1.2 Scheduling Theory

Scheduling models come in several variations depending
upon the original problems from which the models are
abstracted. The general model given below captures the
essence of most of these models and serves to present the
basic notation which will be employed subsequently.
Variations which are covered in subsequent chapters will be
noted as they are encountered.

From the theoretical viewpoint, scheduling problems are
problems of combinatorial optimization. Consequently,
several well known approaches (Aho, Hopcroft & Ullman, 1974;
Baker, 1974; Lawler, 1976; Weide, 1977; etc) for studying
such problems are applicable. Conversely, advances in
scheduling theory tend to have similar effects on other
problems of combinatorial optimization. After the
presentation of the general model, this section is concluded
with a discussion of problem classification and reduction,
the techniques for obtaining efficient algorithms, and
heuristic approaches. The ideas are applicable to any
combinatorial optimization problem but the discussion is
centered mainly on their application to scheduling problems.
1.2.1 The General Model and Basic Notation

The model has the following constituents: (a)
resources, (b) task system, (c) sequencing constraints and
(d) the performance measure to be applied.

(a) The resources consist of a set of m processors
{P_1, ..., P_m}. In the most general model there is also a set
of additional resource types, but this will not be covered
in this thesis.

(b) The task system can be defined as the system
({T_i}, <, {t_i[j]}, {w_i}) as follows:

(1) {T_i}, 1 ≤ i ≤ n, is a set of n tasks to be executed.

(2) < is an (irreflexive) partial order defined on the set
of tasks which specifies operational precedence
constraints. That is, T_i < T_j implies that T_i must be
completed before the execution of T_j can begin.

(3) t_i[j] ≥ 0 is the time required to execute task T_i on
processor P_j, 1 ≤ i ≤ n. When the execution time is
independent of the processor, the second index, j, is
dropped giving time t_i for task T_i.

(4) The weights w_i, 1 ≤ i ≤ n, are interpreted as deferral cost
rates and are assumed to be constant. Thus, the cost of
finishing task T_i at time t is simply w_i t.

(c) Schedules may be preemptive or nonpreemptive. In a
nonpreemptive schedule, a task cannot be interrupted once it
has begun execution. A preemptive schedule allows a task to
be interrupted and removed from the processor under the
assumption that it will eventually receive all of its
required execution time, and there is no loss of execution
time due to preemptions.

(d) For any task T_i, the time at which the execution of
T_i is started will be denoted by S(T_i) and the time at which
T_i is completed will be denoted by F(T_i). Of the three
performance measures mentioned earlier, only the first two
will be considered, namely, minimizing the schedule length,
w = MAX{F(T_i)} for all tasks T_i, and minimizing the mean
weighted flow time, mwft = (Σ w_i F(T_i))/n. When the weights
w_i are all the same, the mean weighted flow time will be
referred to simply as mean flow time or mft.
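
As a small illustration of the two performance measures, the following Python sketch computes them from a list of finishing times. The function names are hypothetical, and the mean is assumed to be the weighted sum of finishing times divided by the number of tasks n.

```python
from fractions import Fraction

def schedule_length(finish_times):
    """Schedule length w: the largest finishing time F(T_i)."""
    return max(finish_times)

def mean_weighted_flow_time(finish_times, weights=None):
    """Assumed definition: (sum of w_i * F(T_i)) / n.

    With weights omitted every w_i is 1, which gives the mean flow time.
    """
    n = len(finish_times)
    if weights is None:
        weights = [1] * n
    total = sum(w * f for w, f in zip(weights, finish_times))
    return Fraction(total, n)

finish = [2, 5, 9]                                   # integer finishing times
print(schedule_length(finish))                       # 9
print(mean_weighted_flow_time(finish))               # 16/3
print(mean_weighted_flow_time(finish, [3, 1, 2]))    # 29/3
```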

1.2.2  Problem  Classification  and  Reduction 

It is useful to be able to classify combinatorial
problems according to their degree of complexity. Since the
reports of Cook (1971) and Karp (1972), classification into
the classes P and NP has become widely used. These problem
classes were originally defined in connection with language
recognition problems but have found application in other
disciplines. An algorithm is said to be polynomial-bounded
if its worst-case complexity is bounded by a polynomial
function of the input size, that is, if there is a
polynomial g such that for each input of size n the
algorithm terminates after at most g(n) steps. A problem is
polynomial-bounded if there is a polynomial-bounded
algorithm for it. The reason for the criterion of
polynomial-boundedness is that, in general, a
polynomial-bounded algorithm will complete its task in
reasonable time. For problems, polynomial-boundedness
approximates one's intuitive notion of tractability while a
problem whose solution requires exponential time may be
looked upon as intractable. Roughly speaking, the class P
may be identified with the class of problems for which
polynomial-bounded algorithms exist, whereas all problems in
the class NP can be solved by polynomial-depth backtrack
search.

In this context, all problems must be stated in terms
of recognition problems that require a yes/no answer. For
scheduling problems, the transformation is accomplished by
specifying a target, D, on the performance measure and
asking if there is a schedule with performance measure not
exceeding D.

The following concept of problem reduction is essential
for further understanding of the P and NP classes. A
problem, X, is reducible to problem Y if for any instance of
X an instance of Y can be constructed in polynomial-bounded
time such that solving the instance of Y will solve the
instance of X as well. Problem Y is NP-complete if Y is in
the class NP and X is reducible to Y for every problem X in
NP.
A good (that is, polynomial-bounded) algorithm for any
NP-complete problem could be used to construct good
algorithms for every problem in NP. However, no such
algorithm has been found and it seems likely that none ever
will. Many problems of a combinatorial nature (Garey &
Johnson, 1976a) and in particular many scheduling problems
(Ullman, 1974, 1975; Garey, Johnson & Sethi, 1976; Gonzalez
& Sahni, 1976; Lenstra & Rinnooy Kan, 1978; etc) are
NP-complete. In chapters 3, 4 and 5 several new
additions are made to this ever growing list.

Generally, scheduling problems trivially belong to the
class NP (Ullman, 1974). Consequently, in order to prove
NP-completeness it is sufficient only to demonstrate a
polynomial reduction from a known NP-complete problem. The
known NP-complete problems which are used in subsequent
chapters are PARTITION and 3-PARTITION, defined (Gonzalez,
Johnson & Sethi, 1976) as follows:

PARTITION:

Given a set {a_1, ..., a_n} of n non-negative integers
whose sum is 2K, does there exist a subset u of the
indices {1, ..., n} such that Σ_{i∈u} a_i = K?

3-PARTITION:

Given a set {a_1, ..., a_3n} of 3n non-negative integers
whose sum is nK, and K/4 < a_i < K/2, does there exist a
partition of the integers into n disjoint groups of
three elements each such that each group sums exactly
to K?

So far, questions concerning the actual nature of the
computer performing the algorithms and the manner in which
the size of a problem's input is to be measured have been
left untouched. Experience indicates that the actual nature
of the computer is relatively unimportant as far as
algorithm complexity is concerned so long as the assumptions
made are reasonable. It is assumed here that the
hypothetical computer is capable of executing conventional
instructions such as integer arithmetic, numerical
comparisons and branching operations. Normally, each
instruction takes one unit of time and unlimited random
access memory is available; however, the actual amount of
the memory required for each algorithm will be determined to
within a constant factor.

The question of measuring input size is much more
difficult. Some refinement in the classification,
NP-complete, is possible according to the manner in which
data is encoded. For scheduling problems the input size is
normally taken to be either the number of tasks in the
system or the sum of execution times of all tasks in the
system. Usually, NP-complete results using the former
measure are weaker than those using the latter. Garey and
Johnson (1978) present a comprehensive discussion of this
topic. Following their terminology, 3-PARTITION and problems
shown NP-complete by reduction from 3-PARTITION in this
thesis are strongly NP-complete as opposed to PARTITION and
problems reduced from PARTITION.
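
The distinction can be made concrete with a short sketch. The standard subset-sum recursion below decides PARTITION in time proportional to n times K, which is polynomial when the input size is measured by the sum of the integers but not when it is measured by the length of their binary encoding. No comparable procedure is expected for a strongly NP-complete problem such as 3-PARTITION. The function name is hypothetical and the code is only an illustration, not a method taken from this thesis.

```python
def has_partition(a):
    """Decide PARTITION for a list of non-negative integers.

    Keeps the set of subset sums not exceeding K = sum(a) / 2, so the
    work is bounded by roughly len(a) * K steps (pseudo-polynomial).
    """
    total = sum(a)
    if total % 2:                 # an odd total cannot be split evenly
        return False
    k = total // 2
    reachable = {0}               # subset sums found so far
    for x in a:
        reachable |= {s + x for s in reachable if s + x <= k}
    return k in reachable

print(has_partition([3, 1, 1, 2, 2, 1]))   # True  (3 + 2 = 1 + 1 + 2 + 1)
print(has_partition([1, 1, 4]))            # False
```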

1.2.3  Solution  Techniques 

The techniques that have been found useful in solving
scheduling and indeed combinatorial optimization problems in
general include linear programming, recursion and
enumeration, random sampling and heuristics. Baker (1974)
and Lawler (1976) discuss these methods in some detail.
Recursion and enumeration include dynamic programming
(Sahni, 1976; Baker & Schrage, 1978), branch and bound (also
called backtrack programming) (Ignall & Schrage, 1965;
Kohler & Steiglitz, 1974) and neighbourhood search
techniques (Kohler & Steiglitz, 1971). Several of these
techniques are illustrated by material in ensuing chapters.

A technique that seems peculiar to scheduling is that
of adjacent pairwise interchange as exemplified by Johnson's
(1954) algorithm. In this case, an arbitrary schedule is
modified and improved by considering the effect on the
schedule's performance of interchanging two adjacent tasks.
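
For illustration, the sketch below implements Johnson's rule for the two-machine flow shop in its usual textbook form, which is not spelled out in this thesis: jobs whose first operation is no longer than their second go first in non-decreasing order of the first operation, and the remaining jobs follow in non-increasing order of the second. The rule is the ordering at which no adjacent interchange can shorten the schedule. All names and the sample data are hypothetical, and the brute-force search is included only as a check on a small instance.

```python
from itertools import permutations

def johnson_order(jobs):
    """jobs is a list of (a, b): time on machine 1, then on machine 2."""
    first = sorted((i for i, (a, b) in enumerate(jobs) if a <= b),
                   key=lambda i: jobs[i][0])
    last = sorted((i for i, (a, b) in enumerate(jobs) if a > b),
                  key=lambda i: jobs[i][1], reverse=True)
    return first + last

def flow_shop_length(jobs, order):
    """Schedule length when both machines process jobs in this order."""
    end1 = end2 = 0
    for i in order:
        a, b = jobs[i]
        end1 += a                      # machine 1 never waits
        end2 = max(end2, end1) + b     # machine 2 waits for machine 1
    return end2

jobs = [(3, 2), (1, 4), (5, 1), (2, 2)]
order = johnson_order(jobs)
best = min(flow_shop_length(jobs, p) for p in permutations(range(len(jobs))))
print(order, flow_shop_length(jobs, order), best)   # [1, 3, 0, 2] 12 12
```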

As noted earlier, for those problems which have been
proved NP-complete, it is highly unlikely that good
algorithms will ever be found. Hence, emphasis shifts to the
design of heuristic methods. Rather than ensure that an
optimal solution is obtained, such methods try to provide
reasonably good solutions in polynomial time. An important
analysis problem for such algorithms is the determination of
how far, in some sense, from the optimal the heuristic can
get. Well-known heuristics include keep-busy or
greedy-processor scheduling, priority-list or simply list
scheduling, and SPT and LPT rules (Coffman, 1974). In
greedy-processor schedules, a task is assigned to a
processor as soon as the processor is idle and the task is
ready to be executed, that is, all preceding tasks have been
executed. List schedules are keep-busy schedules in which
tasks are selected for execution on available processors
according to a pre-determined priority rating. SPT (LPT)
schedules are list schedules in which highest priority is
given to the task with the shortest (longest) execution
time.
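
A minimal sketch of list scheduling for independent tasks on m identical processors follows. It assumes no precedence constraints, so the next task on the list always goes to the processor that becomes free earliest. The function names and the sample task times are hypothetical.

```python
import heapq

def list_schedule(times, m, priority=None):
    """Non-preemptive list schedule of independent tasks.

    priority is the order in which task indices are considered
    (default: the given order). Returns (length, assignment), where
    assignment[p] lists the tasks run on processor p in order.
    """
    order = list(range(len(times))) if priority is None else list(priority)
    free = [(0, p) for p in range(m)]      # (time free, processor id)
    heapq.heapify(free)
    assignment = [[] for _ in range(m)]
    length = 0
    for i in order:
        start, p = heapq.heappop(free)     # earliest idle processor
        assignment[p].append(i)
        finish = start + times[i]
        length = max(length, finish)
        heapq.heappush(free, (finish, p))
    return length, assignment

def spt_order(times):
    """Shortest processing time first."""
    return sorted(range(len(times)), key=lambda i: times[i])

def lpt_order(times):
    """Longest processing time first."""
    return sorted(range(len(times)), key=lambda i: -times[i])

times = [1, 1, 1, 1, 4]
print(list_schedule(times, 2)[0])                      # 6
print(list_schedule(times, 2, lpt_order(times))[0])    # 4
```

On this instance the given list (which is also an SPT list) leaves the long task until last, while the LPT list starts it first and reaches the optimal length.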

A fairly recent technique for combinatorial
optimization problems is that of approximation algorithms
(Sahni, 1975; Ibarra & Kim, 1975). Such algorithms are
guaranteed to produce, in polynomial time, solutions that are
arbitrarily close to the optimal. The application of this
idea to scheduling problems has not received much attention
(Sahni, 1976, gives some results for independent tasks) and
this seems to be a promising area for further investigation.
Note, however, the results of (Garey & Johnson, 1978) which
indicate that for strong NP-complete problems, fully
polynomial approximation schemes cannot be obtained.

1.3  Outline  of  Thesis 

For most models, the scheduling problem in its most
general form has been shown NP-complete, as previously
noted. However, the development and analysis of heuristic
methods as well as the classification of interesting and
useful restricted models remains an important study. In this
thesis, several scheduling models are considered and a
number of restricted types of scheduling problems are
solved.

Chapter 2 is concerned with the question of determining
tight bounds for the lengths of list schedules of
independent tasks on m identical processors in terms of the
optimal schedule length. The relevant literature is surveyed
and an open problem of Graham (1974) is partially solved.

Chapter 3 introduces the processor bound systems of
which the flow shop and open shop considered in the
following two chapters may be considered special types. A
survey in this area is provided. Then it is shown that the
problem of scheduling unit execution time processor bound
systems is NP-complete even when the precedence constraints
are restricted to chains. In addition, a dynamic programming
method is suggested for scheduling such systems, and this
leads to a consideration of terminal subset enumeration, an
interesting combinatorial problem in itself.

After a brief survey of pertinent results, a number of
special cases of the three-stage flow shop minimal length
scheduling problem are shown to be NP-complete in Chapter 4.
Then polynomial algorithms are presented for two other
cases. The design of the algorithms illustrates the use of
adjacent pairwise interchange and enumeration.

The open shop is dealt with in Chapter 5. After a brief
introduction and survey, the major part of the chapter is
devoted to a reduction from 3-PARTITION which shows the
two-processor minimal mean flow time problem to be
NP-complete. The chapter is concluded with two theorems
bounding the performance of arbitrary and SPT schedules in
terms of the optimal mean flow time.

The final chapter summarizes the results and discusses
several suggestions for further research.


Chapter  Two 


BOUNDS  ON  SCHEDULES  FOR  INDEPENDENT  TASKS 

Perhaps one of the most basic scheduling problems is
that of minimizing the schedule length for a set of
independent tasks (that is, no precedence constraints) on
parallel identical processors. Assume the system consists of
n tasks, T_i, 1 ≤ i ≤ n, with execution times t_i, 1 ≤ i ≤ n,
and m identical processors, P_j, 1 ≤ j ≤ m. In the
single-processor case, m = 1, the schedule length is constant
for all sequences of the n tasks and there is no optimization
problem, whether schedules are allowed to be preemptive or
non-preemptive. However, for m ≥ 2 processors, the problem
can be solved easily for preemptive schedules while the case
for non-preemptive schedules is extremely difficult. This
chapter deals with the problem of providing tight bounds for
the list scheduling heuristic for the non-preemptive
problem.

2.1  Survey 

The well-known linear algorithm for the preemptive
problem was first reported by McNaughton (1959) and for
quite a while no comparative results for the non-preemptive
case were known. The reason for this became apparent when
the non-preemptive problem was shown to be NP-complete
(Bruno, Coffman & Sethi, 1974). As for heuristics, list
schedules and, in particular, LPT schedules are known
(Graham, 1974; Baker, 1974) to have good performance. Sahni
(1976) applied dynamic programming and developed an
approximation algorithm for this problem. More recently,
Coffman, Garey and Johnson (1978) obtained better
performance than LPT, in general, with their algorithm which
was developed with ideas from the theory of bin-packing.
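
The preemptive algorithm is not reproduced in this survey. As a hedged sketch, the wrap-around rule usually attributed to McNaughton can be written as follows: the optimal preemptive length is the larger of the longest task time and the average load per processor, and the processors are filled one after another up to that length, splitting a task whenever it crosses the boundary. The function name and output format are hypothetical.

```python
from fractions import Fraction

def mcnaughton(times, m):
    """Wrap-around rule for preemptive scheduling of independent tasks.

    Returns (length, schedule); schedule[p] is a list of
    (task, start, end) pieces executed on processor p.
    """
    length = max(Fraction(max(times)), Fraction(sum(times), m))
    schedule = [[] for _ in range(m)]
    p, clock = 0, Fraction(0)
    for i, t in enumerate(times):
        remaining = Fraction(t)
        while remaining > 0:
            piece = min(remaining, length - clock)
            schedule[p].append((i, clock, clock + piece))
            clock += piece
            remaining -= piece
            if clock == length:        # processor p is full, move on
                p, clock = p + 1, Fraction(0)
    return length, schedule

length, schedule = mcnaughton([3, 3, 3], 2)
print(length)        # 9/2
print(schedule[1])   # task 1 finishes its split piece, then task 2 runs
```

Because no task is longer than the computed length, the two pieces of a split task never overlap in time.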

Let w_0 be the length of an optimal non-preemptive
schedule, Z_0, and let w be the length of an arbitrary list
schedule, Z. Graham (1969, 1972, 1974) showed that
w/w_0 ≤ 2 - 1/m. This general bound is found to apply over a
wide range of values of the parameters of the system,
namely, task times, number of processors, and priority lists
used. In particular, when the priority list is modified and
the other parameters are kept constant this bound is
achievable even if the ratio of maximum and minimum task
execution times is never more than 4 (Graham, 1974). Graham
left as an open problem the improvement of the bound for
lower execution time ratios.
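
The general bound can be seen to be attained on a classic instance, sketched below: m(m - 1) unit tasks followed on the list by one task of length m. The list schedule has length 2m - 1 while the optimal length is m. Note that this instance has execution time ratio m, so it does not illustrate the restricted-ratio case discussed in this chapter. The function names are hypothetical.

```python
from fractions import Fraction
import heapq

def list_length(times, m):
    """Length of the list schedule that takes tasks in the given order."""
    free = [0] * m                     # times at which processors go idle
    heapq.heapify(free)
    for t in times:
        start = heapq.heappop(free)    # earliest idle processor
        heapq.heappush(free, start + t)
    return max(free)

def worst_case_instance(m):
    """m(m-1) unit tasks, then a single task of length m."""
    return [1] * (m * (m - 1)) + [m]

for m in (2, 3, 4):
    tasks = worst_case_instance(m)
    w = list_length(tasks, m)                           # long task last
    w_opt = list_length(sorted(tasks, reverse=True), m) # long task first
    # w_opt equals m here, which is optimal: total work is m*m.
    print(m, w, w_opt, Fraction(w, w_opt), 2 - Fraction(1, m))
```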


Let the ratio between the longest execution time and
the shortest be r ≥ 1. For r = 1, every task in the system
has the same execution time and, hence, every list schedule
is optimal. Thus, in this case w/w_0 = 1. It seems
reasonable, therefore, to expect that as r approaches 1, the
bound on w/w_0 can be reduced below 2 - 1/m which applies in
general. (One may think of systems with small ratio, r, as
systems with similar tasks.)

In this chapter, the effects of lowering the maximum
execution time ratio, r, are studied and tighter bounds on
list schedule length as compared to the optimal length are
derived as follows:

(1) For r ≤ 3,

            2 - 1/(3⌊m/3⌋),   for m ≥ 6,
   w/w_0 ≤  17/10,            for m = 5,
            5/3,              for m = 3, 4.

(2) For r ≤ 2,

            5/3 - 1/(3⌊m/2⌋), for m ≥ 4,
   w/w_0 ≤  3/2,              for m = 2, 3.
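
For reference, the stated bounds can be tabulated with exact arithmetic. The sketch below assumes the reading 2 - 1/(3⌊m/3⌋) for r ≤ 3 and 5/3 - 1/(3⌊m/2⌋) for r ≤ 2, together with the listed small cases. The function name is hypothetical.

```python
from fractions import Fraction

def list_bound(m, r_max):
    """Bound on w/w_0 for execution time ratio r <= r_max (2 or 3)."""
    if r_max == 3:
        if m >= 6:
            return 2 - Fraction(1, 3 * (m // 3))
        if m == 5:
            return Fraction(17, 10)
        if m in (3, 4):
            return Fraction(5, 3)
    elif r_max == 2:
        if m >= 4:
            return Fraction(5, 3) - Fraction(1, 3 * (m // 2))
        if m in (2, 3):
            return Fraction(3, 2)
    raise ValueError("no bound is stated for this case")

for m in range(3, 10):
    general = 2 - Fraction(1, m)       # the bound for arbitrary r
    print(m, list_bound(m, 3), list_bound(m, 2), general)
```

In every tabulated case the restricted bound is no larger than the general bound 2 - 1/m.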


The result for r ≤ 3 is proved by contradiction. It is
shown by induction on the number of processors that there
cannot exist any pair of schedules Z and Z_0 whose finish
time ratio w/w_0 is larger than the stated value. If there
exists such a pair of schedules, which is referred to as a
counter-example, for m processors, then by some simple
"normalization" operations it can be modified to produce a
counter-example for m - 3 processors. But, as will be seen,
no counter-examples exist for small values of m. For the
second case where r ≤ 2, a similar technique is again
employed. In both cases, examples which achieve the stated
bounds are given.
2.2 Normalization


In this section, several lemmas, which will be useful
in deriving the bound w/w_0 ≤ 2 - 1/p for some value of p,
are given. Hence the lemmas use (2 - 1/p) as a tentative
bound without specifying a value for p. Since the bound
(2 - 1/m) is known, p must be less than or equal to m.

In the following, the longest and shortest execution
times will be denoted by t_L and t_S respectively. The
starting time of task T_L, the task with execution time t_L,
will be denoted by y, that is, S(T_L) = y. If several tasks
have the same execution time, t_L, one will be selected and
it will be clear from the context which one it is.


By a counter-example is meant a given task system with
a list schedule Z and an optimal schedule Z_0 such that
w/w_0 > 2 - 1/p. The effect of applying the following lemmas,
a process called normalization, is to transform any given
counter-example into another counter-example with a specific
structure.


Lemma 2.1: For a given set of tasks, T_i, 1 ≤ i ≤ n, there
exists a list schedule with longest finish time (of all list
schedules) for which F(T_L) = w. In other words, T_L is the
last task or one of the last tasks to be finished.


Proof: Let Z' be a list schedule with longest finish
time w' and F'(T_L) ≠ w' for any task T_L with the maximum
execution time t_L. (S'(T_i) and F'(T_i) are the starting and
finishing times of T_i in schedule Z'.) Since there is no
precedence relationship between the tasks, the tasks on each
processor can be sorted in non-decreasing order. This
operation does not change the length of Z'.

Now, let P_j be a processor which finishes last in Z'.
Then, there exists T_k executed on P_j with F'(T_k) = w'. Also,
assume that P_i is the processor that executes a task T_L with
processing time t_L so that P_i is idle after time F'(T_L).
Because of the manner in which list schedules are built
there can be no idle time in Z' before S'(T_k). Hence
S'(T_k) ≤ F'(T_L) and t_k + t_L ≥ w' - S'(T_L). Now, form a new
schedule Z with F(T_L) = w and w ≥ w' as follows. Replace T_L
on P_i by T_k giving a schedule in which the first idle time
occurs at MIN{S'(T_k), F(T_k)}, where F(T_k) is of course the
finishing time of T_k in the new schedule. Make T_L the last
task on processor P_i if F(T_k) ≤ S'(T_k); otherwise, make T_L
the last task on processor P_j. This yields the new schedule
Z with w ≥ w' and F(T_L) = w. (See Figure 1. Hatched areas
indicate idle periods on processors.) Since by hypothesis Z'
has longest finishing time, it follows that w = w'. □

Thus, without loss of generality, one may assume that
the worst list schedule has the form depicted in Figure 2,
where U indicates the total busy times on other processors
during the execution of T_L. There can be no idle periods in
the first y = S(T_L) time units. Possible idle periods in the
optimal schedule Z_0 are not indicated in the Figure. For the
rest of this chapter assume all counter-examples to be
represented as above.

Lemma 2.2: If the worst list schedule Z with the
corresponding optimal schedule Z_0 satisfies w/w_0 > 2 - 1/p
then w_0 < p(m-1)t_L/((p-1)m) for m ≥ 2.

Proof: Refer to Figure 2. By considering the
"processor-busy" areas in Z and Z_0,

   m w_0 ≥ m y + t_L + U = m y + m t_L - (m - 1)t_L + U
   => m w_0 ≥ m w - (m - 1)t_L
   => 1 + (m - 1)t_L/(m w_0) ≥ w/w_0.

Since w/w_0 > 2 - 1/p, 1 + (m - 1)t_L/(m w_0) > 2 - 1/p from
which the result follows. □

Corollary: If the conditions of Lemma 2.2 hold and
p = 3⌊m/3⌋, then w_0 < 6t_L/5 for m ≥ 3 and w_0 < 21t_L/20 for
m ≥ 6.

Proof: There are three cases depending on the value of
m (mod 3).

(1) m = 3i, p = 3i. From Lemma 2.2, w_0 < t_L.

(2) m = 3i+1, p = 3i. From Lemma 2.2,
    w_0 < (1 + 1/(m^2 - 2m))t_L ≤ 9t_L/8 for m ≥ 4.

(3) m = 3i+2, p = 3i. From Lemma 2.2,
    w_0 < (1 + 2/(m^2 - 3m))t_L ≤ 6t_L/5 for m ≥ 5.

Similarly for m ≥ 6.
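
The constants in the Corollary can be checked numerically. The sketch below evaluates the coefficient p(m-1)/((p-1)m) of t_L from Lemma 2.2 with p = 3⌊m/3⌋ using exact fractions. The function name and the tested range of m are arbitrary choices made for illustration.

```python
from fractions import Fraction

def lemma_2_2_factor(m):
    """Coefficient c with w_0 < c * t_L when p = 3 * floor(m / 3)."""
    p = 3 * (m // 3)
    return Fraction(p * (m - 1), (p - 1) * m)

for m in range(3, 10):
    print(m, lemma_2_2_factor(m))

# The largest value for m >= 3 occurs at m = 5, and for m >= 6 at m = 8.
print(max(lemma_2_2_factor(m) for m in range(3, 200)))   # 6/5
print(max(lemma_2_2_factor(m) for m in range(6, 200)))   # 21/20
```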
Lemma 2.3: Given a counter-example with list schedule
Z' and optimal schedule Z'_0 we can obtain another
counter-example Z, Z_0 for which w = 2w_0 - 1 and
y = w_0 - 1 + g, where t_L = w_0 - g, 0 ≤ g ≤ 1.

Proof: This requires only the application of a scaling
operation to all task times. From the definition of
counter-examples for Z', 2 - 1/p < w'/w'_0 ≤ 2 - 1/m. Let
w'/w'_0 = 2 - 1/d for some real number d, p < d ≤ m. Multiplying
all t_i by d/w'_0 and using the same schedules as in Z' and Z'_0
gives Z, Z_0 with w_0 = d, w/w_0 = 2 - 1/w_0, w = 2w_0 - 1 and
the ratio w/w_0 = w'/w'_0 > 2 - 1/p. Furthermore, since (see
Figure 2) w = y + t_L and t_L ≤ w_0, y = w_0 - 1 + g and
t_L = w_0 - g for some g, 0 ≤ g ≤ 1. □

In the following it is assumed that the scaling
operation of Lemma 2.3 has been applied to the schedules.

Lemma 2.4: Let Z and Z_0 be a counter-example for m ≥ 3
and r ≤ 3. Then, no processor executes more than three tasks
in Z_0. Furthermore, the processor which executes T_L in Z_0
does no other tasks.


Proof: From the Corollary of Lemma 2.2, w_0 < 6t_L/5.
Suppose some processor executes four tasks, T_i(1), T_i(2),
T_i(3), T_i(4), in Z_0; then

   t_i(1) + t_i(2) + t_i(3) + t_i(4) ≥ 4t_L/3 > 6t_L/5 > w_0,

which is a contradiction. Also, since

   w_0 - t_L < t_L/5 < t_L/3 ≤ t_S,

the processor which executes T_L in Z_0 can process no other
task. □

Lemma 2.5: If list schedule Z and optimal schedule Z_0
form a counter-example on m processors, m ≥ 7, and
p = 3⌊m/3⌋, then t_L ≥ y.

Proof: Since y = 2w_0 - 1 - t_L (by Lemmas 2.1 and 2.3;
see Figure 2), t_L ≥ y if and only if t_L - (2w_0 - 1 - t_L) ≥ 0
or 1 + 2(t_L - w_0) ≥ 0. The last inequality is easily shown
to be true as follows. By Lemma 2.2,

   w_0 < p(m - 1)t_L/((p - 1)m)
   => 1 + 2(t_L - w_0) > 1 + 2[m w_0 (p - 1)/(p(m - 1)) - w_0]
                       = 1 - 2w_0 (m - p)/(p(m - 1))
                       ≥ 1 - 2m(m - p)/(p(m - 1)),

(since w_0 ≤ m after scaling by Lemma 2.3). Thus, it is
sufficient to show that G = 2(m - p)m/(p(m - 1)) ≤ 1.

For p = m, G = 0.

For p = m - 1, G ≤ 1 as 2m ≤ m^2 - 2m + 1 for m ≥ 6.

For p = m - 2, G ≤ 1 as m^2 - 7m + 2 ≥ 0 for m ≥ 7.

Hence, for m ≥ 7, 1 + 2(t_L - w_0) ≥ 0 and t_L ≥ y. □
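
As a numerical check of the last step, the sketch below evaluates G = 2(m - p)m/(p(m - 1)) with p = 3⌊m/3⌋ using exact fractions. It also shows why the argument needs m to be large enough: the inequality fails at m = 5. The function name and the tested range are arbitrary choices made for illustration.

```python
from fractions import Fraction

def G(m):
    """G = 2(m - p)m / (p(m - 1)) with p = 3 * floor(m / 3)."""
    p = 3 * (m // 3)
    return Fraction(2 * (m - p) * m, p * (m - 1))

for m in range(4, 12):
    print(m, G(m), G(m) <= 1)

print(all(G(m) <= 1 for m in range(7, 500)))   # True
print(G(5))                                    # 5/3, so G > 1 at m = 5
```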

A task will be referred to as k-partnered in Z_0 if it
is executed with k other tasks on the same processor in Z_0.
For a list schedule Z the definition of k-partners will be
the same except that T_L is not counted as a partner of any
other task.

Lemma 2.6: Let Z and Z_o represent a counter-example as before, with m ≥ 7, r ≤ 3, and p = 3⌊m/3⌋. Then, there exists a counter-example consisting of schedules Z' and Z'_o with the same finish time ratio w'/w'_o = w/w_o for which the following properties hold:

(i) In both Z' and Z'_o any 2-partnered task is smaller (in execution time) than any 1-partnered task which in turn is smaller than any 0-partnered task.

(ii) In Z'_o, if T_i and T_j are 1-partnered and are executed on the same processor, then if t_i ≥ t_j then t_i ≥ w_o/2.

Proof: If the properties do not hold already, a number of operations are performed on the schedules and task times while keeping the finish time ratio unchanged.

(i) First consider the list schedule Z. With reference to Figure 2, any modification of the tasks except T_L in Z which does not introduce idle time before time y = 2w_o - 1 - t_L will yield a list schedule with length not less than w.

Since any three tasks have total execution time at least as big as t_L and by Lemma 2.5 t_L ≥ y, it can be ensured that the smallest tasks are the 2-partnered tasks by exchanging any larger 2-partnered tasks with smaller 1- or 0-partnered tasks.

Furthermore, any 0-partnered task T_k in Z has the property that t_k ≥ y. Hence if there is a larger 1-partnered task T_j, where t_j > t_k ≥ y, then exchanging T_j and T_k will still give a schedule with finish time w. Thus, the tasks in Z can be rearranged accordingly (yielding schedule Z') such that a 0-partnered task is no smaller than a 1-partnered task which in turn is no smaller than any 2-partnered task. This concludes the proof of part (i) for Z'.

For Z_o it is first shown that a 0-partnered task in Z' is also a 0-partnered task in Z_o and vice versa. Let T_k be a 0-partnered task in Z'; then t_k ≥ w_o - 1 and w_o - t_k ≤ 1 < t_L/3. (From Lemmas 2.3 and 2.5, after the scaling operation, t_L ≥ y ≥ w_o - 1 > p - 1 ≥ m - 3; hence t_L > 3 for m ≥ 6.) Hence no other task can be executed on the same processor with T_k in Z_o. That is, a 0-partnered task in Z' is a 0-partnered task in Z_o. Now suppose there is a 0-partnered task, T_k, other than T_L in Z_o which is not 0-partnered in Z'. As in the proof of Lemma 2.2 a contradiction is obtained by considering the processor-busy areas of Z_o and Z' (see Figure 2). Suppose there are u ≥ 0 0-partnered tasks common to Z' and Z_o. Let q = m - u - 2 be the number of processors (excluding the processors which execute T_L and T_k) that execute 2- and 1-partnered tasks in Z_o. The total execution time on the q processors in Z_o is at most w_o q. Since T_k has a partner in Z', its partner must take at least t_L/3. Therefore, out of the total execution time on the q processors at most w_o q - t_L/3 must be shared between q + 1 processors in Z'. Hence,

    q w_o - t_L/3 > (q + 1)(w_o - 1)

    => q + 1 > w_o + t_L/3

    => (m - 2) + 1 ≥ q + 1 > w_o + t_L/3 ≥ w_o + 5w_o/18

    => m - 1 > 23w_o/18 ≥ 23(m - 2)/18   (since w_o > p ≥ m - 2)

    => 18(m - 1) > 23(m - 2), or 5m < 28,

giving a contradiction for m ≥ 6. Hence, a 0-partnered task in Z_o must also be a 0-partnered task in Z' (except for T_L). It follows that the 0-partnered tasks are the largest in Z_o, since they are the largest in Z'.

The construction of schedule Z'_o can now be specified. Let T_a be the smallest 1-partnered task in Z_o, T_b its (larger) partner and T_c the largest 2-partnered task in Z_o. If t_a ≥ t_c, then (i) holds for Z_o and Z'_o is identical to Z_o. Suppose t_a < t_c. Note that t_c ≤ w_o - 2t_L/3 since t_L/3 is the minimum possible length for each of its partners. Hence, t_a < w_o - 2t_L/3. Now, increase t_a to w_o - 2t_L/3 exactly, thus forcing t_a ≥ t_c. Since t_a + t_b ≤ w_o, t_b must be reduced, if necessary, to 2t_L/3. This operation of setting t_a = w_o - 2t_L/3 and t_b = 2t_L/3 does not change w_o. In Z', increasing any task time will not reduce w and thus w/w_o does not decrease. Also, it is obvious that T_b can be a 2- or 1-partnered task in Z' and decreasing a 2-partnered task's time to 2t_L/3 will not reduce w. For a 1-partnered task, even if the other partner has minimum time, t_L/3, their total length must be at least t_L ≥ y, ensuring no reduction in w. The operation can be repeated until there is no 1-partnered task that is smaller than a 2-partnered task in Z_o, thus obtaining schedule Z'_o. This concludes part (i) for Z'_o.


(ii) Here, note that if the larger of a pair of tasks executed on the same processor in Z_o is less than w_o/2 then there must be idle time on that processor. Increase the larger task's time by the amount of idle time on the processor. Again, w_o does not increase and w does not decrease. □

The last of the normalization results imposes an order on the 1-partnered tasks in both schedules Z and Z_o of a counter-example.

Lemma 2.7: Let Z and Z_o be a counter-example as before. The 1-partnered tasks in Z and Z_o can be ordered so that the largest is a partner of the smallest, the next largest a partner of the next smallest and so on. The result remains a counter-example.

Proof: The schedules are modified with the aid of two operations similar to those employed by Graham (1974).

Operation I:

Sort the tasks on each processor in non-decreasing order. This does not change the schedule's length.

Operation II:

Let T_i and T_j be partners, and similarly for T_g and T_h. (See Figure 3.) If t_i < t_g and t_j < t_h, exchange T_j and

FIGURE 3: Operation II of Lemma 2.7
Apply both operations iteratively until there are no indices g, h, i, j for which operation II can be applied. It is easy to see that the effect is to delay the first occurrence of idle time and possibly decrease the schedule's length. Also, only a finite number of operations will be done (Graham, 1974). Since Z_o is optimal, w_o will remain unchanged. Note that in Z, T_L is not considered for the above operations (recall that T_L is not considered a partner of any other task). Hence, the relevant effect of operations I and II on Z is the possible delay of the first occurrence of idle time y. Since y is not decreased, w will not be decreased. □


This concludes the presentation of the normalization lemmas. Let p = 3⌊m/3⌋, r ≤ 3, m ≥ 7. Then, if counter-examples exist, consider the worst case. By application of the previous lemmas, it can be transformed into a counter-example with schedules Z and Z_o for which w = 2w_o - 1, the 2-partnered tasks are the smallest in both Z and Z_o, and the 0-partnered are the largest. Furthermore the 1-partnered tasks in both schedules are paired in an orderly manner and can be represented as in Figure 4. Note that the set of tasks {T_i | 1 ≤ i ≤ n} - {T_L} occupies one more processor in Z than in Z_o. Also, recall (from the proof of Lemma 2.6) that a task is 0-partnered in Z_o if and only if it is 0-partnered in Z. Hence, this 1-processor gain must be achieved by having a difference of exactly 6 tasks in the division between 1- and 2-partnered tasks. It follows that there are at least two processors in Z_o that perform three tasks each (exactly two more than in Z, not counting T_L in Z).


Furthermore, it is guaranteed that at least three processors in Z_o perform two tasks each for the following reason. Suppose there are k < 3 processors that perform two tasks each in Z_o. Then, the number of 1-partnered tasks is 2k in Z_o and 2k + 6 in Z, where the additional 6 tasks in Z are 2-partnered in Z_o. Now, for k < 3, these six tasks are greater in number than k + 3, the number of processors which execute the 2k + 6 tasks in Z. Hence, two of the six tasks which are 2-partnered in Z_o must be executed on the same processor in Z. Their total time is at most 2t_c = 2(w_o - 2t_L/3), using t_c of Lemma 2.6. Using w_o > p (by Lemma 2.3) and the second part of the corollary to Lemma 2.2, it is easily shown that this number is less than w_o - 1, and hence is not sufficient for y (that is, contradicting the fact that there is no idle time in Z before time y).


In the next section, tight bounds for systems with r ≤ 3 and r ≤ 2 are proved using normalized counter-examples.
2.3 Bounds for Similar Tasks

The following relations, which apply to a normalized counter-example, should be kept in mind while studying subsequent proofs in this section.

(1) w_o < p(m - 1)t_L/((p - 1)m) for m ≥ 2. (Lemma 2.2)

(2) w_o < 6t_L/5 for p = 3⌊m/3⌋, m ≥ 3. (Corollary of Lemma 2.2)

(3) w = 2w_o - 1, y = w_o - 1 + g, t_L = w_o - g, where 0 ≤ g ≤ 1. (Lemma 2.3)

(4) p < w_o ≤ m. Hence, w_o ≥ 6 for p = 3⌊m/3⌋, m ≥ 7. (Follows from Lemma 2.3, since for a counter-example, 2 - 1/p < w/w_o ≤ 2 - 1/m.)

(5) w_o ≥ t_L ≥ y ≥ w_o - 1 for p = 3⌊m/3⌋, m ≥ 7. (From Lemmas 2.3 and 2.5.)
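
Relation (2) is relation (1) specialised to p = 3⌊m/3⌋. As an illustrative check, the sketch below evaluates the coefficient p(m - 1)/((p - 1)m) of t_L exactly and looks for its largest value. The function names and the tested range of m are assumptions made for this example only.

```python
from fractions import Fraction


def lemma22_factor(p: int, m: int) -> Fraction:
    """Coefficient of t_L in relation (1): p(m - 1) / ((p - 1)m)."""
    return Fraction(p * (m - 1), (p - 1) * m)


def corollary_factor(m: int) -> Fraction:
    """The same coefficient with p = 3*floor(m/3), as used in relation (2)."""
    return lemma22_factor(3 * (m // 3), m)


# Over the tested range the coefficient never exceeds 6/5, the constant in (2).
worst = max(corollary_factor(m) for m in range(3, 300))
print(worst)  # 6/5
```

The maximum is reached at m = 5 (p = 3), which is consistent with m = 4 and m = 5 being treated as special cases later in the section.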

2.3.1 Tasks with Largest Execution Time Ratio ≤ 3

Informally, the proof for the bound for m ≥ 6 runs as follows: If a counter-example exists for any m, m ≥ 6, then another can be constructed for m - 3 processors (Lemma 2.8). Hence, by running through a series of constructions a counter-example for m = 6, 7 or 8 can be produced. But in Lemma 2.9 it is shown that no counter-examples exist for m = 6, 7 or 8, thus giving a contradiction. The bound is shown to be tight by demonstrating some examples.

Lemma 2.8: Let Z, Z_o constitute a normalized counter-example for m processors, where p = 3⌊m/3⌋, m ≥ 9 and r ≤ 3. Then, one can construct a counter-example for m - 3 processors.


Proof: Note that all results of previous lemmas, where dependent on m, are valid for m ≥ 7, and also there is no need to consider those cases where m is a multiple of 3 since the bound in these cases is exactly 2 - 1/m, which has been proved (Graham, 1974).

The counter-example for m - 3 processors is constructed in two steps. First, six of the 1-partnered tasks in Z_o are deleted (recall that the existence of at least six of them has been determined). Second, the execution times of some of the tasks are reduced slightly. It must be shown that in the resulting task system the ratio r is still less than or equal to 3, and that the resulting schedules do form a counter-example for m - 3 processors.


The tasks to be deleted are precisely those in the middle range of the 1-partnered set in Z_o, namely, {T_{i+u+1}, ..., T_{i+u+6}} (see Figures 4, 5). The task times of the remaining tasks will then be reduced thus:

           t_j - 1, if j ≤ i+u,
    t'_j = t_j - 2, if i+u+7 ≤ j ≤ i+2u,
           t_j - 3, if j > i+2u.

Note that after task time reduction, for every pair (of 1-partners) above which has not been disrupted by task deletion the total processor time is reduced by 3. But the tasks
{T_{i+u-5}, ..., T_{i+u}} lose their partners in Z.

Now, t_{i+u-5} + t_{i+u+6} ≥ y ≥ w_o - 1.

Similarly, t_{i+u} + t_{i+u+1} ≥ w_o - 1.

Hence, t_{i+u-5} + t_{i+u} + t_{i+u+1} + t_{i+u+6} ≥ 2w_o - 2.

But t_{i+u+1} + t_{i+u+6} ≤ w_o (being a pair in Z_o).

Hence, t_{i+u-5} + t_{i+u} ≥ w_o - 2.

Similarly, it can be shown that

    t_{i+u-4} + t_{i+u-1} ≥ w_o - 2,

and t_{i+u-3} + t_{i+u-2} ≥ w_o - 2.

Since the above six tasks are among the tasks reduced by 1,

    t'_{i+u-5} + t'_{i+u} ≥ w_o - 4,

    t'_{i+u-4} + t'_{i+u-1} ≥ w_o - 4,

and t'_{i+u-3} + t'_{i+u-2} ≥ w_o - 4.
But this is equivalent to the reduction in execution time on the other processors in Z. Hence, the tasks which are left with no partners in Z can be paired against each other. This gives two new schedules Z', Z'_o for m - 3 processors with w' = w - 6, w'_o = w_o - 3 and

    w'/w'_o = (w - 6)/(w_o - 3) = 2 - 1/(w_o - 3) > 2 - 1/(p - 3).

It is obvious that Z' is indeed a list schedule for the reduced set of tasks. Hence, if it can be shown that the task time ratios are less than or equal to 3 for Z', Z'_o, then they constitute a counter-example for m - 3 processors. This is done by showing that (I) t'_L = MAX{t'_i}, (II) t'_1 = MIN{t'_i}, and (III) t'_L/t'_1 ≤ 3. Note that the {t'_i} are in three non-decreasing sequences:

    {t'_1, ..., t'_{i+u}} - obtained by subtracting 1,

    {t'_{i+u+7}, ..., t'_{i+2u}} - obtained by subtracting 2,

and {t'_{i+2u+1}, ..., t'_n = t'_L} - obtained by subtracting 3.
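
The index bookkeeping of this construction can be sketched in code. This is only an illustration: the helper names `reduce_task_times` and `reduced_ratio`, the dictionary representation of the task times, and the sample values of i and u are assumptions for this example and do not come from the source.

```python
from fractions import Fraction


def reduce_task_times(times: dict, i: int, u: int) -> dict:
    """Delete T_{i+u+1}..T_{i+u+6} and shorten the remaining tasks.

    `times` maps the task index j (1-based, tasks in non-decreasing order
    of execution time) to t_j. The result maps surviving j to t'_j.
    """
    reduced = {}
    for j, t in times.items():
        if i + u + 1 <= j <= i + u + 6:
            continue                    # one of the six deleted tasks
        if j <= i + u:
            reduced[j] = t - 1          # first sequence
        elif j <= i + 2 * u:
            reduced[j] = t - 2          # second sequence (j >= i+u+7)
        else:
            reduced[j] = t - 3          # third sequence (0-partnered tasks)
    return reduced


def reduced_ratio(w_o: Fraction) -> Fraction:
    """w'/w'_o = (w - 6)/(w_o - 3), starting from w = 2*w_o - 1."""
    w = 2 * w_o - 1
    return (w - 6) / (w_o - 3)


# Hypothetical instance: 20 tasks with t_j = 10 + j, i = 3, u = 7.
sample = {j: 10 + j for j in range(1, 21)}
new_times = reduce_task_times(sample, i=3, u=7)
print(len(new_times))                                 # 14
print(reduced_ratio(Fraction(9)) == 2 - Fraction(1, 6))  # True
```

The second function confirms the identity (w - 6)/(w_o - 3) = 2 - 1/(w_o - 3) used above for one sample value of w_o.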


(I) Show t'_L = MAX{t'_i}, 1 ≤ i ≤ n.

t'_L is obviously the largest element of the third sequence. Now t_{i+6} ≥ w_o/3 (see Figure 4). t_{i+6} is the largest of the 2-partnered tasks in Z_o which are 1-partnered in Z. If it is less than w_o/3, then by reasoning along the lines of Lemma 2.6, part (ii), it can always be forced to equal this number at the end of normalization without changing the counter-example status of the schedules.


Hence, t_{i+7} ≥ w_o/3, since t_{i+7} ≥ t_{i+6}.

But t_{i+7} + t_{i+2u} ≤ w_o (a pair in Z_o).

Therefore, t_{i+2u} ≤ 2w_o/3,

or  t'_{i+2u} ≤ 2w_o/3 - 2

             ≤ (2/3)(6t_L/5) - 2   (from the Corollary of Lemma 2.2)

             = (4t_L - 10)/5.

Since t'_L = t_L - 3 = (5t_L - 15)/5, t'_L ≥ t'_{i+2u} holds if 5t_L - 15 ≥ 4t_L - 10, that is, if t_L ≥ 5. But for m ≥ 6 and hence for p ≥ 6,

    6t_L/5 > w_o > p => t_L > 5.

Therefore, t'_L is at least as large as t'_{i+2u}. Furthermore (see Figure 5),

    t_{i+u} + t_{i+u+6} ≤ t_{i+u} + t_{i+u+7} ≤ w_o

and t_{i+u} ≤ t_{i+u+6}.

Hence, t_{i+u} ≤ w_o/2

and t'_{i+u} ≤ w_o/2 - 1 ≤ (3t_L - 5)/5.

It follows that t'_L is larger than t'_{i+u} for t_L ≥ 5 or m ≥ 6.

Thus, t'_L is the largest of the new task times.


(II) Show t'_1 = MIN{t'_i}, 1 ≤ i ≤ n.

t'_1 is obviously the smallest of the elements of the first sequence. Now, for j > i + 2u, task T_j is 0-partnered in Z. Hence, t_j ≥ w_o - 1. This implies that

    t'_j ≥ w_o - 4 ≥ w_o/3 - 1   (for w_o ≥ 6)

         ≥ t_1 - 1 = t'_1.

Thus t'_1 is smaller than the smallest element in the third sequence. Now, t_{i+u+7} (see Figure 5) is the larger of two tasks executed on the same processor in Z_o and by Lemma 2.6, part (ii),

    t_{i+u+7} ≥ w_o/2.

This implies

    t'_{i+u+7} ≥ w_o/2 - 2

               ≥ w_o/3 - 1   (true for w_o ≥ 6)

               ≥ t_1 - 1 = t'_1.

Thus t'_1 is the minimum of the new task times.


(III) t'_L/t'_1 ≤ 3 <=> t'_L ≤ 3t'_1 <=> t_L - 3 ≤ 3(t_1 - 1) <=> t_L ≤ 3t_1.

Hence, t'_L/t'_1 ≤ 3, since t_L ≤ 3t_1 holds for r ≤ 3. □
Lemma 2.9: No counter-examples exist for p = 6, r ≤ 3 and m = 6, 7 or 8.

Proof: The approach is to show that the value attainable for y in any example is not large enough to yield a counter-example.

m = 6: The bound is already proved for multiples of 3 (Graham, 1974).

m = 7, 8: Again, note that all the normalization lemmas hold for m ≥ 7. After application of these lemmas a counter-example with 2u 1-partnered tasks in Z, u ≤ m, is obtained. It is easily verified (using the relationships in Figure 4; the result is due to the fact that u ≤ m is too small) that at least one of the tasks T_{i+1}, ..., T_{i+6} has a partner in Z, T_a say, where T_a is either

(i) 1-partnered in Z_o and smaller than its partner in Z_o, and hence t_a ≤ w_o/2, or

(ii) one of the above mentioned six tasks. Each of the execution times t_{i+1}, ..., t_{i+6} is less than or equal to w_o - 2t_L/3 (the maximum for a 2-partnered task). Hence for this case t_a is also less than or equal to w_o/2.

Hence, y ≤ (w_o - 2t_L/3) + w_o/2

         = (9w_o - 4t_L)/6,

and w/w_o = (y + t_L)/w_o

          ≤ (9w_o + 2t_L)/(6w_o) ≤ 2 - 1/6.

Thus, the supposed counter-example cannot be a counter-example; contradiction. □

Theorem 2.1: If r ≤ 3 and m ≥ 6, then w/w_o ≤ 2 - 1/(3⌊m/3⌋). This bound can be achieved.

Proof: By Lemma 2.8 if a counter-example exists for m ≥ 9, then counter-examples for m - 3, m - 6, ..., (8 or 7 or 6) can be constructed. But by Lemma 2.9 no counter-examples exist for m = 6, 7 or 8. Contradiction.

To complete the proof of the theorem it remains to show that the bound is achievable. This is shown schematically in Figure 6. □

The second theorem considers the special cases m = 4, 5.

Theorem 2.2: Assume r ≤ 3. If m = 4 then w/w_o ≤ 5/3 (i.e. 2 - 1/3) and if m = 5 then w/w_o ≤ 17/10 (i.e. 2 - 3/10). The bounds are tight.

Proof: Consider first the case for m = 4.

m = 4: Note that Lemmas 2.1, 2.2, 2.3 and 2.4 apply for m = 4. The set of tasks except T_L must be executed on one more processor in Z than in Z_o with no idle time before y.

If this gain in processors is achieved by having a non-0-partnered task T_k in Z_o become 0-partnered in Z, then t_k ≤ w_o - t_L/3. Hence,

    y ≤ t_k ≤ w_o - t_L/3

and w/w_o = (y + t_L)/w_o

          ≤ (3w_o + 2t_L)/(3w_o) ≤ 2 - 1/3.

FIGURE 6: Examples which achieve the bounds of Theorem 2.1 (w/w_o = 2 - 1/(3⌊m/3⌋), m ≥ 6, r = 3)

Hence, for a counter-example, a task which is 0-partnered in Z is 0-partnered in Z_o. The reverse can also be shown by an argument similar to that in the proof of Lemma 2.6. As before there must be at least two processors in Z_o (exactly two more than in Z) doing three tasks. There are thus only three possibilities. See Figure 7 for a sketch. Label the task times of tasks with two partners V_i and those of the 1-partnered tasks W_i; the 0-partnered tasks are labelled separately.

(a) Refer to Figure 7. The six tasks V_i, 1 ≤ i ≤ 6, must be executed on three processors in Z. Since there can be no idle time before y,

    y ≤ (1/3) Σ_{i=1}^{6} V_i ≤ 2w_o/3

and w/w_o = (y + t_L)/w_o ≤ (2w_o/3 + t_L)/w_o

          ≤ (2w_o/3 + w_o)/w_o = 2 - 1/3.

Hence, such a counter-example does not exist.

(b) In this case, since the W_i must be paired with at least two of the V_i, the remaining tasks, which are executed on the other two processors, give

    y ≤ (V_i(1) + V_i(2) + V_i(3) + V_i(4))/2
      ≤ (2w_o - 2t_L/3)/2

since t_L/3 is the minimum task time for each of the V_i already paired up. Hence, y ≤ w_o - t_L/3 and as before w/w_o ≤ 2 - 1/3, and no such counter-example can exist.

FIGURE 7: Possible Z_o schedules, m = 4, r ≤ 3

(c) Here the nine 2-partnered tasks have to be executed on four processors in Z. At least one processor must execute three of the V_i. The minimum total time for these three is t_L. Therefore, the remaining six tasks, which are executed on three processors, yield

    y ≤ (3(t_L + e) - t_L)/3,

where e = w_o - t_L, and hence w/w_o ≤ 2 - 1/3.

Figure 8 shows the bound for m = 4 to be tight.


m = 5: The discussion at the start of case m = 4 applies, leaving six possibilities for a counter-example as in Figure 9.

(a) Similar to case (a) for m = 4.

(b) Similar to case (b) for m = 4.

(c) Here there are tasks W_i, 1 ≤ i ≤ 4, which are 1-partnered and V_i, 1 ≤ i ≤ 6, 2-partnered in Z_o. Since all V_i, W_i must have partners (see discussion at start of m = 4) at least two of the V_i must again be partners in Z. Since any two V_i take at most 2(w_o - 2t_L/3),

    w/w_o ≤ (2(w_o - 2t_L/3) + t_L)/w_o

          = (5w_o + e)/(3w_o),

where e = w_o - t_L. As 4w_o ≥ 5y,

    w/w_o = (y + t_L)/w_o ≤ (4w_o/5 + t_L)/w_o = 4/5 + t_L/w_o.

Assume w/w_o > 17/10 so that 17/10 < 4/5 + t_L/w_o. Then, 9/10 < t_L/w_o, e = w_o - t_L < w_o/10, and

    w/w_o ≤ (5w_o + e)/(3w_o) < 5/3 + 1/30 = 17/10.

Contradiction. Hence, w/w_o ≤ 17/10.

FIGURE 8: Worst case for m = 4, r ≤ 3

FIGURE 9: Possible schedules, m = 5, r ≤ 3
(d) Similar to case (c) for m = 4.

(e) Here, one can see that two of the smaller tasks V_i must be paired with W_i, leaving seven tasks to be shared between three processors. Since one of these three processors must have three tasks there remain four small tasks (V_i) for two processors. Even if all previously assigned tasks have minimum execution time, t_L/3, this still yields

    y ≤ [3(t_L + e) - 5t_L/3]/2

      = (4t_L + 9e)/6,

    w/w_o ≤ (10t_L + 9e)/(6t_L + 6e)

          ≤ 2 - 1/3 < 2 - 3/10.

(f) Here, five processors must share twelve tasks. At least two of the processors must have at least three tasks each, leaving no more than six tasks for three processors. Again, even if those tasks assigned three per processor take t_L/3 each this leaves

    y ≤ (4(t_L + e) - 6t_L/3)/3

and w/w_o ≤ 2 - 1/3.

The schedule in Figure 10 shows the bound to be tight. □
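
The closing inequalities of cases (c), (e) and (f) for m = 5 can be checked with exact arithmetic. The sketch below is illustrative only: the function names, the normalisation t_L = 1 and the grid of sample values for e = w_o - t_L are assumptions made for this example.

```python
from fractions import Fraction


def ratio_case_c(e_over_wo: Fraction) -> Fraction:
    """Upper bound (5*w_o + e)/(3*w_o) from case (c), in terms of e/w_o."""
    return (5 + e_over_wo) / 3


def ratio_case_e(t_L: Fraction, e: Fraction) -> Fraction:
    """Upper bound (10*t_L + 9e)/(6*t_L + 6e) from case (e)."""
    return (10 * t_L + 9 * e) / (6 * t_L + 6 * e)


def ratio_case_f(t_L: Fraction, e: Fraction) -> Fraction:
    """Upper bound (y + t_L)/w_o with y = (4(t_L + e) - 6*t_L/3)/3, case (f)."""
    y = (4 * (t_L + e) - 2 * t_L) / 3
    return (y + t_L) / (t_L + e)


# Sample values 0, 1/100, ..., 9/100 stay strictly below 1/10.
grid = [Fraction(k, 100) for k in range(10)]

# Case (c): e < w_o/10 forces the bound strictly below 17/10.
assert all(ratio_case_c(x) < Fraction(17, 10) for x in grid)

# Cases (e) and (f): with t_L normalised to 1 the bound never exceeds 5/3.
one = Fraction(1)
assert all(ratio_case_e(one, x) <= Fraction(5, 3) for x in grid)
assert all(ratio_case_f(one, x) <= Fraction(5, 3) for x in grid)
```

At the boundary value e = w_o/10 the case (c) bound equals exactly 17/10, which is why the strict inequality e < w_o/10 is needed for the contradiction.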


FIGURE 10: Worst case for m = 5, r ≤ 3
2.3.2 Tasks With Largest Execution Time Ratio ≤ 2

In this sub-section, the case when r ≤ 2 is considered. It is first shown by contradiction that w/w_o ≤ 5/3 - 2/(3m) and that this bound is best possible for even m.

Subsequently, a technique similar to that used in the previous sub-section is used to prove the tight bound 5/3 - 2/(3(m - 1)) for odd m.

Lemma 2.10: If w/w_o > 5/3 - 2/(3m) and r ≤ 2 then

(i) t_L > 2w_o/3.

(ii) no processor in Z_o will execute more than two tasks and the processor with T_L executes no other tasks.

Proof: (i) Rewrite the ratio as w/w_o > 2 - 1/p with p = 3m/(m + 2). By Lemma 2.2, w_o < (p(m - 1)t_L)/((p - 1)m). Substituting p into the inequality gives t_L > 2w_o/3.

(ii) If there exists one processor which executes more than two tasks in Z_o or if the processor with T_L executes more than one task, then the length of Z_o is at least 3t_L/2 > w_o, which is a contradiction. Thus (ii) is obtained for r ≤ 2. □
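
The substitution in part (i) can be verified exactly. The sketch below is illustrative: the function names and the tested range of m are assumptions for this example.

```python
from fractions import Fraction


def p_for_r2(m: int) -> Fraction:
    """The value p = 3m/(m + 2) used to rewrite 5/3 - 2/(3m) as 2 - 1/p."""
    return Fraction(3 * m, m + 2)


def lemma22_factor(p: Fraction, m: int) -> Fraction:
    """Coefficient p(m - 1)/((p - 1)m) of t_L in Lemma 2.2."""
    return p * (m - 1) / ((p - 1) * m)


for m in range(2, 50):
    p = p_for_r2(m)
    # The rewritten ratio agrees with the stated bound ...
    assert 2 - 1 / p == Fraction(5, 3) - Fraction(2, 3 * m)
    # ... and Lemma 2.2 then reads w_o < (3/2) t_L, i.e. t_L > 2 w_o / 3.
    assert lemma22_factor(p, m) == Fraction(3, 2)
```

The coefficient is 3/2 for every m tested, so the conclusion t_L > 2w_o/3 does not depend on the number of processors.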

Theorem 2.3: If r ≤ 2 and m ≥ 4, then w/w_o ≤ 5/3 - 2/(3m).

Proof: By contradiction. From Lemma 2.1 the tasks on (m - 1) processors in Z_o will be executed on m processors in Z. Furthermore, if w/w_o > 5/3 - 2/(3m), it is obvious from Lemma 2.10 that at least one of the 1-partnered tasks in Z_o will be 0-partnered in Z. As the processing time for the 1-partnered tasks in Z_o is at most w_o - t_L/2, it follows that w ≤ w_o - t_L/2 + t_L = w_o + t_L/2 and w/w_o ≤ 1 + t_L/(2w_o), which is less than or equal to 5/3 - 2/(3m) for m ≥ 4. Thus the theorem is proved. □


Theorem 2.4: If r ≤ 2 and m ≤ 4, then w/w_o ≤ 3/2.

Proof: By contradiction. Suppose w/w_o > 3/2. Then by Lemma 2.2, w_o < (p(m - 1)t_L)/((p - 1)m) and for p = 2, m ≤ 4, this gives w_o < 3t_L/2. Hence, no processor executes more than two tasks in Z_o and the processor which executes T_L executes no other tasks. But this implies w/w_o ≤ 3/2 (similar to the proof in Theorem 2.3), which is a contradiction. □


The bounds are best possible by considering the examples given in Figure 11 for m = 2, 3 and 2k, k ≥ 2.

However, the bound 5/3 - 2/(3m) is not tight for odd m ≥ 5. A tight bound 5/3 - 2/(3(m - 1)) can be proved for odd m ≥ 5 by arguments similar to those used in Lemmas 2.8 and 2.9 and Theorem 2.1.
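
To make the quantities w and w_o concrete, the following sketch simulates list scheduling of independent tasks and finds the optimal length by brute force. It is an illustration only: the function names are hypothetical, and the three-task instance is chosen here for the example and is not claimed to be the one drawn in Figure 11.

```python
import heapq
from itertools import product


def list_schedule_length(times, m):
    """Length w of the list schedule for independent tasks.

    Tasks are taken in list order and each one starts on the processor
    that becomes free first.
    """
    loads = [0] * m
    heapq.heapify(loads)
    for t in times:
        earliest_free = heapq.heappop(loads)
        heapq.heappush(loads, earliest_free + t)
    return max(loads)


def optimal_length(times, m):
    """Optimal length w_o by trying every assignment (small inputs only)."""
    best = None
    for assignment in product(range(m), repeat=len(times)):
        loads = [0] * m
        for t, proc in zip(times, assignment):
            loads[proc] += t
        length = max(loads)
        if best is None or length < best:
            best = length
    return best


# Two processors, execution times with ratio r = 2.
tasks = [1, 1, 2]
w = list_schedule_length(tasks, 2)   # 3: the long task starts after a short one
w_o = optimal_length(tasks, 2)       # 2: both short tasks share one processor
print(w, w_o)                        # 3 2
```

For this instance w/w_o = 3/2, which meets the bound of Theorem 2.4 for m = 2. The brute-force search is exponential in the number of tasks and is meant only for very small instances.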


Theorem 2.5: If r ≤ 2 and m ≥ 5, m odd, then w/w_o ≤ 5/3 - 2/(3(m - 1)).

Proof: The proof is similar to that of Lemmas 2.8, 2.9 and Theorem 2.1 and is therefore only sketched here.
First note the following points:

(a) Lemma 2.1 applies and hence w = y + t_L.

(b) w_o ≥ 3t_L/2, for otherwise it can be shown by Lemma 2.10 that w/w_o ≤ 3/2.

(c) Given counter-example schedules Z and Z_o, w/w_o > 5/3 - 2/(3(m - 1)) but it has already been proved that w/w_o ≤ 5/3 - 2/(3m). Hence, 5/3 - 2/(3(m - 1)) < w/w_o ≤ 5/3 - 2/(3m). Scale the tasks of the counter-example schedules such that 3(m - 1)/2 < w_o ≤ 3m/2 and w/w_o = 5/3 - 1/w_o. Then w = 5w_o/3 - 1, t_L = 2w_o/3 - g, for some g ≥ 0 (see (b) above) and y = w_o - 1 + g.

(d) There are no 0-partnered tasks in Z and Z_o, and T_L has a partner in Z_o. First consider schedule Z. If task T_k is 0-partnered then t_k ≥ y. But t_k ≤ t_L. Hence, t_L ≥ y and w = y + t_L ≤ 2t_L. By (b) above, w/w_o ≤ 2(2w_o/3)/w_o = 4/3, which is less than or equal to 5/3 - 2/(3(m - 1)) for m > 3, giving a contradiction.

For schedule Z_o, if T_L has no partner then the total execution time of the remaining m - 1 processors in Z_o is (m - 1)w_o. Hence in scheduling the tasks in Z, y becomes at most (m - 1)w_o/m. Therefore w/w_o = (y + t_L)/w_o is at most ((m - 1)w_o/m + t_L)/w_o ≤ (m - 1)/m + 2/3 (by (b)). This is a contradiction since (m - 1)/m + 2/3 is less than 5/3 - 2/(3(m - 1)) for m > 3. The case for any other task not having a partner in Z_o can be obtained by a similar consideration of total busy periods of all processors.

(e) There is at least one processor in Z_o which performs three tasks. Otherwise, as in the proof of Theorem 2.4, w/w_o ≤ 3/2.

(f) Given counter-example schedules Z, Z_o, either the following hold or another pair of schedules Z', Z'_o can be constructed for which they hold.

(i) Any 2-partnered task (or T_L's partner) is smaller than any 1-partnered task.

(ii) If a 1-partnered task is larger than or equal to its partner then its execution time t satisfies t ≥ w_o/2. The proof only requires arguments similar to those of Lemma 2.6.

(g) The 1-partnered tasks can be ordered in both Z and Z_o by Lemma 2.7.

The above facts lead to the configuration in Figure 12 for a counter-example. A processor reduction by two is done by deleting tasks {T_{i+u+1}, ..., T_{i+u+4}}. This is followed by reducing task times thus:

           t_j - 1, j ≤ i+u,
    t'_j =
           t_j - 2, j ≥ i+u+5,

which leads to a counter-example for m - 2 processors.

For the initial case, m = 5, the proposed bound 5/3 - 2/(3(m - 1)) is equal to 3/2. All preprocessing of counter-examples before the processor reduction stage is found to be valid for m = 5. Thus (see Figure 12) one obtains a pair of tasks T_i, T_j (1-partners in Z), for which T_i is 2-partnered in Z_o, and T_j is 1-partnered in Z_o and is smaller than or equal to its partner in Z_o. Now t_i ≤ w_o - t_L since its two partners must take at least t_L/2 each, and t_j ≤ w_o/2.

Hence, y ≤ t_i + t_j ≤ w_o - t_L + w_o/2

and w/w_o ≤ 3/2.

The bound is tight by Figure 13. □

2.4  Discussion 

This chapter has investigated the behaviour of tight
bounds on list schedules of independent tasks on m identical
processors as the degree of similarity between the tasks'
execution times is varied. Out of this investigation the
relationships in Figure 14 emerge. As might be intuitively
expected, the worst case heuristic schedule length
approaches the optimal as the ratio r is reduced. Note
however that even for a ratio as low as 2, the heuristic can
still take up to 3/2 times as long as the optimal.

Another interesting point to note is that the examples
which illustrate the tightness of the bounds all attain the
maximum ratio allowed, an indication that the bounds might
be tightened further for values of r in between the integer
values considered.

FIGURE 14: Worst ratio w/w_0 vs r


Chapter  Three 


PROCESSOR  BOUND  SYSTEMS 


Consider the problem of scheduling a set of n tasks,
{T_1, T_2, ..., T_n}, on a multiprocessor computer system that
has m ≥ 2 processors {P_1, P_2, ..., P_m} capable of independent
operation on independent tasks. As usual, it is assumed that
there is a partial ordering, <, specified on the set of
tasks in the form of a directed acyclic graph. In addition,
it is also assumed that no two tasks are identical. Thus,
for each task T_i a processor index, R(T_i), is specified so
that task T_i must be executed on the R(T_i)-th processor.
Such a system is called an m-processor bound system. This
model covers such well known instances of processor-bound
systems as the flow and job shops (Johnson, 1954;
Garey, Johnson & Sethi, 1976; Gonzalez & Sahni, 1978; Chin &
Tsai, 1978) and the open shop systems (Gonzalez & Sahni,
1976; Gonzalez, 1976).


In this chapter, non-preemptive schedules for the case
in which all tasks have the same execution time (unit
execution time or UET) are considered. It is shown that the
problem of scheduling such systems to minimize schedule
length is NP-complete even when the tasks have a very simple
precedence structure consisting of chains. Subsequently, a
dynamic programming solution is presented.

3.1 Survey

Goyal (1977) shows that the problem of scheduling
m-processor bound UET systems to minimize schedule length is
NP-complete in two restricted cases:

(1) m-processor bound UET systems, arbitrary m, with the
precedence constraints restricted to being a forest,

(2) 2-processor bound systems with arbitrary precedence
constraints.

The first is shown by a reduction from the NP-complete
problem of Edge Covering (Garey & Johnson, 1975; Karp, 1972)
and the second by a reduction from the UET scheduling
problem (Ullman, 1974).

Furthermore, Goyal presents a simple level algorithm,
similar to that of Hu (1961), which produces optimal
schedules if the precedence graph is in the form of a cyclic
forest, that is, one in which all tasks in the same level of
the forest require the same processor, but for any two tasks
T_i and T_j in two adjacent levels h and h-1 respectively,
their processor requirements satisfy the relation

    R(T_i) = R(T_j) + 1,  if 1 ≤ R(T_j) < m,
    R(T_i) = 1,           if R(T_j) = m.

Thus, the problem of scheduling m-processor bound UET
systems with a specific value of m and with the precedence
relations restricted to being a forest is left open.


Liu and Liu (1977) and Jaffe (1978) consider a slightly
more general model in which there are q types of processors,
with m_i identical processors of type i, 1≤i≤q, and derive
bounds on the lengths of list schedules in terms of the
optimal schedule length.

3.2  Definitions 


If the precedence constraints include the relation
a < b for tasks a and b, then a is an immediate predecessor
of b and b is an immediate successor of a. The predecessors
of b are all those tasks which must be done before b can be
executed (and thus include a and its predecessors).
Similarly, the successors of task a are all tasks which can
be done only after task a has been executed (and thus
include b and its successors). The precedence graph
consists of chains if each node has at most one immediate
predecessor and at most one immediate successor. It is a
terminally rooted tree if each node has at most one
immediate successor and there is exactly one node, the root,
which has no successor. Similarly, it is an initially rooted
tree if each node has at most one immediate predecessor and
there is exactly one node, the root, which has no
predecessor. Nodes which have no predecessors (successors)
in a terminally (initially) rooted tree are called leaf
nodes. A terminally rooted forest consists of a set of
terminally rooted trees. Similarly, an initially rooted
forest consists of a set of initially rooted trees.

An m-processor bound UET scheduling problem with
precedence constraints restricted to k chains will be
referred to as a k-chain problem. Similarly, an m-processor
bound UET scheduling problem with a precedence constraint
which is a terminally rooted tree will be referred to as a
tree problem.
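
The definitions above can be checked mechanically. The following is a
minimal sketch in Python, assuming the precedence graph is supplied as
a list of (a, b) pairs for the immediate relations a < b and that it is
acyclic; the function names and the input representation are
illustrative and do not come from the original text.

```python
def immediate_degrees(tasks, edges):
    """Count immediate predecessors and successors of every task.

    `edges` holds pairs (a, b) meaning a is an immediate predecessor of b.
    """
    n_pred = {v: 0 for v in tasks}
    n_succ = {v: 0 for v in tasks}
    for a, b in edges:
        n_succ[a] += 1
        n_pred[b] += 1
    return n_pred, n_succ


def is_chains(tasks, edges):
    """At most one immediate predecessor and one immediate successor each."""
    n_pred, n_succ = immediate_degrees(tasks, edges)
    return all(n_pred[v] <= 1 and n_succ[v] <= 1 for v in tasks)


def is_terminally_rooted_tree(tasks, edges):
    """At most one immediate successor each, and exactly one root.

    Assumes the graph is acyclic, as a precedence graph must be.
    """
    _, n_succ = immediate_degrees(tasks, edges)
    roots = [v for v in tasks if n_succ[v] == 0]
    return len(roots) == 1 and all(n_succ[v] <= 1 for v in tasks)
```

Note that a single chain satisfies both predicates, while two or more
disjoint chains form a terminally rooted forest but not a tree.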


3.3 Complexity of the k-Chain Problem

In this section, it is shown that the k-chain problem
for arbitrary k is NP-complete. The proof is by reduction
from 3-PARTITION (see Section 1.2.2).

Theorem 3.1: The k-chain problem for arbitrary k is
NP-complete.

Proof: Given an instance of 3-PARTITION consider the
following (3n+1)-chain problem. The chains are Q_i, 0≤i≤3n.
The j-th task of the i-th chain is Q_i[j]. Chain Q_0 has 2nK
tasks while chain Q_i, 1≤i≤3n, has 2a_i tasks. The processor
requirements are

    R(Q_0[j]) = 2  for (2x-2)K+1 ≤ j ≤ (2x-1)K,
    R(Q_0[j]) = 1  for (2x-1)K+1 ≤ j ≤ 2xK,       1≤x≤n,

    R(Q_i[j]) = 1  for 1 ≤ j ≤ a_i,
    R(Q_i[j]) = 2  for a_i < j ≤ 2a_i,            1≤i≤3n.

The deadline is D = 2nK, the length of chain Q_0.
Consequently, any schedule that meets the deadline must
execute a task of chain Q_0 in each time interval. This gives
the template of Figure 15.

Suppose the 3-PARTITION problem has a solution.
Schedule the processor-1 tasks of those chains corresponding
to the three elements of the i-th partition in the i-th idle
slot (left by chain Q_0) on the first processor and the
remaining processor-2 tasks of the same three chains on the
i-th slot on the second processor.

Conversely, suppose the chain problem has a schedule
which finishes by time D. Then, the schedule must have no
idle periods. Chain Q_0 must be executed as shown in Figure
15, leaving 2n idle periods, each of length K, alternating
between the processors.

It is easily shown by induction on the number, n, of
idle periods on either processor that for chains Q_i, i > 0,
if the first task of chain Q_i is done in time slot
[(2x-2)K+1, (2x-1)K], then the last task of the same chain
must be executed in time slot [(2x-1)K+1, 2xK], 1≤x≤n.
Hence, all the processor-1 tasks of Q_i are done in one of
the idle periods of Figure 15 and all the processor-2 tasks
in the next idle period.

In general, let y be the number of chains Q_i, i > 0,
whose processor-1 tasks are done in time slot
[(2x-2)K+1, (2x-1)K], 1≤x≤n. Then, by the above result, the
processor-2 tasks of the same chains are done in time slot
[(2x-1)K+1, 2xK]. It follows that the total number of
processor-1 (or 2) tasks from the y chains is K. Since
K/4 < a_i < K/2, y must be precisely three for any x, 1≤x≤n.
Hence, the 3-PARTITION problem has a solution. □
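
The construction in the proof can be made concrete with a short Python
sketch. It builds the (3n+1)-chain instance from the element sizes a_i
and the bound K, lays out the schedule described for the forward
direction, and checks it for feasibility. The function names, the
0-based indexing of the elements, and the dictionary representation of
a schedule are assumptions made for illustration only.

```python
def build_chain_instance(a, K):
    """Return (chains, D): chains[c][j] is the processor of task j of Q_c."""
    n = len(a) // 3
    q0 = []
    for _ in range(n):
        q0 += [2] * K + [1] * K          # Q_0 alternates K units on P2, K on P1
    chains = [q0] + [[1] * ai + [2] * ai for ai in a]
    return chains, 2 * n * K


def schedule_from_partition(a, K, triples):
    """Lay out the schedule of the forward direction of the proof.

    `triples` lists, for each x, the (0-based) indices of the three
    elements placed in the x-th pair of idle slots.
    Returns (chains, D, start) with start[(chain, position)] = time unit.
    """
    chains, D = build_chain_instance(a, K)
    start = {(0, t - 1): t for t in range(1, D + 1)}   # Q_0 never waits
    for x, triple in enumerate(triples, start=1):
        t1 = (2 * x - 2) * K + 1    # first unit of the idle slot on P1
        t2 = (2 * x - 1) * K + 1    # first unit of the idle slot on P2
        for i in triple:
            for j in range(a[i]):
                start[(i + 1, j)] = t1 + j           # processor-1 half
                start[(i + 1, a[i] + j)] = t2 + j    # processor-2 half
            t1 += a[i]
            t2 += a[i]
    return chains, D, start


def is_feasible(chains, D, start):
    """Check chain order, the deadline, and one task per processor per unit."""
    busy = set()
    for c, chain in enumerate(chains):
        previous = 0
        for j, processor in enumerate(chain):
            t = start[(c, j)]
            if not (previous < t <= D) or (processor, t) in busy:
                return False
            busy.add((processor, t))
            previous = t
    return True
```

With a = [3, 3, 4, 3, 3, 4] and K = 10, the triples (0, 1, 2) and
(3, 4, 5) each sum to K and give a feasible schedule, whereas a
grouping whose sums differ from K overruns an idle slot and is
rejected by the check.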

3.4  Solution  of  the  2-Chain  Problem 

Let the tasks in one of the two chains be a_i, 1≤i≤s,
and the tasks in the other chain, b_j, 1≤j≤t, with precedence
relations a_1<a_2<...<a_s and b_1<b_2<...<b_t, where n = s+t is
the total number of tasks. There are m ≥ 2 processors.

The solution is by dynamic programming. The principle
of optimality is illustrated in the following argument. In
order to obtain an optimal schedule one can proceed as
follows. If R(a_1) ≠ R(b_1) then a_1 and b_1 can be executed in
the first time unit without loss of optimality. This is then
followed by an optimal schedule of the sub-chains a_2<...<a_s
and b_2<...<b_t. If, on the other hand, R(a_1) = R(b_1) then
only one of {a_1, b_1} can be executed in the first time
interval. In one case a_1 is followed by an optimal schedule
of the sub-chains a_2<...<a_s and b_1<...<b_t, while in the
other case, b_1 is followed by an optimal schedule of the
sub-chains a_1<...<a_s and b_2<...<b_t. For the optimal
solution, try both ways and pick the better one.

Let F(i,j), 1≤i≤s, 1≤j≤t, be the length of an optimal
schedule for the sub-chains a_i<...<a_s and b_j<...<b_t. Then
the above discussion implies the relation

    F(1,1) = 1 + F(2,2),               for R(a_1) ≠ R(b_1),
    F(1,1) = 1 + MIN{F(2,1), F(1,2)},  for R(a_1) = R(b_1).

In general, assume that the first i-1 tasks of the a-chain
and the first j-1 tasks of the b-chain have been scheduled
without loss of optimality. (A number of tasks is said to
have been scheduled without loss of optimality in the first
r time units, r > 0, if there exists an optimal schedule
with identical execution for the r units.) Then, the above
argument can be repeated for tasks a_i and b_j and the
following recursion obtained.

    F(i,j) = 1 + F(i+1,j+1),               for R(a_i) ≠ R(b_j),
    F(i,j) = 1 + MIN{F(i+1,j), F(i,j+1)},  for R(a_i) = R(b_j).

Since chain b_j<...<b_t is not defined for j > t, let F(i,t+1)
denote the length of an optimal schedule (equal to length of
sub-chain) of the sub-chain a_i<...<a_s, for i ≤ s. Similarly,
let F(s+1,j) be the length of the sub-chain b_j<...<b_t, for
j ≤ t. Thus, F(i,t+1) = s+1-i, i ≤ s, and F(s+1,j) = t+1-j,
j ≤ t. Taking F(s+1,t+1) to be zero, every element of the
array F(i,j), 1≤i≤s, 1≤j≤t, can easily be computed. To
perform the computation efficiently, it is essential to
compute F(i+1,j), F(i,j+1) and F(i+1,j+1) before ever
attempting to compute or use the value of F(i,j). One way to
achieve this is to compute the elements of F(i,j) in reverse
row order i.e. last row first and for each row, last column
first. This computation is presented in Procedure 1.

    procedure CHAIN-ALG;
    begin comment chains are a_1<...<a_s and b_1<...<b_t.
            F(i,j) is length of optimal schedule for
            sub-chains a_i<...<a_s and b_j<...<b_t, where a
            sub-chain is empty if i>s or j>t respectively.
            This procedure computes the array F(i,j) in
            reverse row order;
        for i := 1 until s+1 do F(i,t+1) := s+1-i;
        for j := 1 until t do F(s+1,j) := t+1-j;
        for i := s step -1 until 1 do
            for j := t step -1 until 1 do
            begin if R(a_i) ≠ R(b_j) then
                F(i,j) := 1 + F(i+1,j+1) else
                F(i,j) := 1 + MIN{F(i+1,j), F(i,j+1)};
            end;
    end;

PROCEDURE 1


Theorem 3.2: An optimal schedule for the 2-chain
problem can be constructed in O(st) time and space using the
procedure CHAIN-ALG.

Proof: Suppose the array F(i,j) has been computed with
the given procedure. Construct a schedule having minimal
length F(1,1) by tracing the computation of F(1,1).

Suppose tracing is currently at F(i,j). If
R(a_i) ≠ R(b_j) schedule a_i and b_j in the next time unit and
trace F(i+1,j+1). If R(a_i) = R(b_j) then schedule a_i and
trace F(i+1,j) if F(i,j) = 1 + F(i+1,j); otherwise schedule
b_j and trace F(i,j+1). It is clear from the foregoing
discussion that this will result in a schedule with length
F(1,1). This process takes O(F(1,1)) = O(s+t) time.

Now, the array F(i,j), 1≤i≤s+1, 1≤j≤t+1, has O(st)
elements and the computation of each element (see
Procedure 1) is done within a constant number of steps.
Hence, at most O(st) time is used to find an optimal
schedule. The O(st) space requirement is obvious. □
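
Procedure CHAIN-ALG and the tracing step of the proof can be sketched
in Python as follows. This is an illustrative transcription rather
than the original implementation: the array is kept 1-based to match
the text (row 0 and column 0 are unused), the processor requirements
are passed as plain lists, and the names `chain_alg` and
`trace_schedule` are assumptions.

```python
def chain_alg(Ra, Rb):
    """Fill F[i][j] for 1 <= i <= s+1, 1 <= j <= t+1 in reverse row order.

    Ra[i-1] is R(a_i) and Rb[j-1] is R(b_j).
    """
    s, t = len(Ra), len(Rb)
    F = [[0] * (t + 2) for _ in range(s + 2)]
    for i in range(1, s + 2):
        F[i][t + 1] = s + 1 - i          # only the a-chain remains
    for j in range(1, t + 1):
        F[s + 1][j] = t + 1 - j          # only the b-chain remains
    for i in range(s, 0, -1):
        for j in range(t, 0, -1):
            if Ra[i - 1] != Rb[j - 1]:
                F[i][j] = 1 + F[i + 1][j + 1]
            else:
                F[i][j] = 1 + min(F[i + 1][j], F[i][j + 1])
    return F


def trace_schedule(Ra, Rb, F):
    """Recover one optimal schedule as a list of (i, j) pairs per time unit.

    An entry (i, j) runs a_i and b_j together; None marks a chain that
    contributes no task in that time unit.
    """
    s, t = len(Ra), len(Rb)
    i, j, schedule = 1, 1, []
    while i <= s and j <= t:
        if Ra[i - 1] != Rb[j - 1]:
            schedule.append((i, j))
            i, j = i + 1, j + 1
        elif F[i][j] == 1 + F[i + 1][j]:
            schedule.append((i, None))
            i += 1
        else:
            schedule.append((None, j))
            j += 1
    # One chain is exhausted; the rest of the other runs one task per unit.
    schedule += [(k, None) for k in range(i, s + 1)]
    schedule += [(None, k) for k in range(j, t + 1)]
    return schedule
```

The traceback follows the same tie-breaking rule as the proof: when
both heads need the same processor, a_i is taken whenever that choice
is consistent with F(i,j).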


Example 3.1: Let m = 3, s = 6, t = 7, a_i = i, 1≤i≤6,
b_j = j, 1≤j≤7,

    (R(a_i), 1≤i≤6) = (1, 1, 3, 3, 2, 2),
    (R(b_j), 1≤j≤7) = (1, 3, 3, 1, 2, 2, 1).

Then the elements of array F(i,j) are as follows:

    index i |  1  2  3  4  5  6  7  8   <- index j
    --------+------------------------
       1    |  8  7  7  7  6  6  6  6
       2    |  8  7  6  6  5  5  5  5
       3    |  8  7  6  5  4  4  4  4
       4    |  7  7  6  5  4  3  3  3
       5    |  7  6  5  4  4  3  2  2
       6    |  7  6  5  4  3  2  1  1
       7    |  7  6  5  4  3  2  1  0

The minimal length is F(1,1) = 8 units. One of the possible
tracings is F(1,1) -> F(1,2) -> F(2,3) -> F(3,4) -> F(4,5) ->
F(5,6) -> F(6,6) -> F(6,7). The schedule which
is generated in conjunction with the above tracing is given
in Figure 16. □

An alternative way to recover an optimal schedule from
the F(i,j) array is to keep another array H(i,j) of pointers
indicating which of F(i+1,j), F(i,j+1), and F(i+1,j+1) was
used to obtain F(i,j). This simplifies the subsequent task
of constructing an optimal schedule after computation of
F(1,1).
The above algorithm is strikingly similar to the
dynamic programming solution of the longest common
subsequence (LCS) problem (Hirschberg, 1975; Brown, 1978).

In fact, for a 2-chain problem on two processors, if the
processor requirements of one of the chains are reversed
(that is, processor-1 tasks become processor-2 tasks and
vice versa) a solution of the LCS problem for the resulting
chains corresponds to a solution of the original scheduling
problem. Thus, the 2-chain problem on two processors is
equivalent to an LCS problem with a 2-symbol alphabet.
Considerable work has been done on the LCS problem (see
Brown for further references) which may be applicable to the
2-chain problem. Unfortunately, the analogy breaks down for
the k-chain problem on m processors when k > 2 or m > 2.
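
The correspondence with LCS can be stated numerically: on two
processors, every time unit that runs a task from each chain matches
one symbol of a common subsequence of the a-chain requirements and the
reversed b-chain requirements, so the optimal length equals s + t minus
the LCS length. The Python sketch below computes both quantities so
the identity can be checked on small instances. The function names
are illustrative, and the identity as written is a derived restatement
of the paragraph above rather than a formula quoted from the text.

```python
def lcs_length(u, v):
    """Length of a longest common subsequence of sequences u and v."""
    L = [[0] * (len(v) + 1) for _ in range(len(u) + 1)]
    for i in range(len(u) - 1, -1, -1):
        for j in range(len(v) - 1, -1, -1):
            if u[i] == v[j]:
                L[i][j] = 1 + L[i + 1][j + 1]
            else:
                L[i][j] = max(L[i + 1][j], L[i][j + 1])
    return L[0][0]


def two_chain_length(ra, rb):
    """Optimal 2-chain schedule length via the F(i,j) recursion (0-based)."""
    s, t = len(ra), len(rb)
    F = [[0] * (t + 1) for _ in range(s + 1)]
    for i in range(s, -1, -1):
        for j in range(t, -1, -1):
            if i == s or j == t:
                F[i][j] = (s - i) + (t - j)      # one chain is exhausted
            elif ra[i] != rb[j]:
                F[i][j] = 1 + F[i + 1][j + 1]
            else:
                F[i][j] = 1 + min(F[i + 1][j], F[i][j + 1])
    return F[0][0]


def flip(r):
    """Swap processor 1 and processor 2 in a requirement list."""
    return [3 - p for p in r]


ra, rb = [1, 2, 2, 1], [2, 2, 1]
assert two_chain_length(ra, rb) == len(ra) + len(rb) - lcs_length(ra, flip(rb))
```

With three or more processors two tasks can run together whenever
their requirements differ, which is no longer an equality test between
symbols, so this reduction does not carry over.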

3.5  Extension  to  More  Complex  Precedence  Graphs 

In this section, the extension of the dynamic
programming solution to k-chains, trees, and arbitrary
precedence digraphs is considered. The extension to k
chains, k > 2, is straightforward, the two-dimensional
array, F, of the previous section being replaced by a
k-dimensional array. This leads to an O(n^k) time and space
algorithm, assuming each chain to be of length n. Now
consider the tree problem for m = 2. (The extension of the
solution to the case of more than two processors will be
straightforward).

For an arbitrary precedence graph (a directed acyclic
graph) a subgraph, G, will be called a terminal subgraph if
it satisfies the condition that if a node, u, is in G then
every successor of u is also in G. A terminal subtree is a
terminal subgraph which is also a tree. The set of nodes in
a terminal subgraph forms a terminal subset of the set of
nodes in the original graph.


Now, consider the tree problem. Given a schedule for
the tree problem, for any integer, t, (less than or equal to
the length of the schedule) the tasks in the final t time
units of the schedule form a terminal subset of the set of
all tasks in the system. The principle of optimality applies
in the same manner as for the 2-chain problem. Suppose that
at the end of the (i-1)-th time unit, a number of tasks have
been scheduled without loss of optimality, leaving a
terminal subtree, G. The only nodes that can be scheduled
for execution in the i-th time unit are leaf nodes of G.
Consider all possible pairings of a processor-1 leaf, g_1, of
G with a processor-2 leaf, g_2, of G for execution in time
unit i. From each such pairing followed by an optimal
schedule of the corresponding reduced terminal subtree, pick
the one which gives the shortest length schedule. This leads
to the recursion, f(G) = MIN{1 + f(G - g_1 - g_2)}, where f(G)
is the length of an optimal schedule for subtree G and the
minimum is taken over all leaves g_1 of G which require the
first processor and all leaves g_2 of G which require the
second processor. Of course, G may have no processor-1 leaf
or processor-2 leaf in which case (G - g_1 - g_2) in the
expression above is replaced by (G - g_2) or (G - g_1). In
order to compute f(G) efficiently for all terminal subtrees,
G, the following is required:

(a) A method of indexing or assigning addresses to subtrees
G so that the locations of (G - g_1 - g_2), (G - g_1) and
(G - g_2) can be referenced quickly given G, g_1 and g_2,

(b) A simple method of enumerating the terminal subtrees G
(or equivalently, enumerating the terminal subsets)
such that (G - g_1 - g_2), (G - g_1) and (G - g_2) are
enumerated before G.

The first problem can be solved by assigning labels to
the tasks so that the index of a tree is the sum of the
labels of the tasks in that tree.
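
Before turning to enumeration, the recursion for f(G) itself can be
sketched in Python. The version below is an illustration only: it
memoizes on the terminal subset directly (a frozenset used as a
dictionary key) instead of on the label-sum index described in the
text, so it avoids requirements (a) and (b) at the cost of hashing
whole subsets. The input format, a map from each task to its
immediate successor, and the function name are assumptions.

```python
from functools import lru_cache


def tree_schedule_length(successor, proc):
    """Optimal schedule length for a 2-processor UET tree problem.

    successor[v] is the immediate successor of task v (None for the root);
    proc[v] is 1 or 2, the processor that task v requires.
    """
    preds = {v: [] for v in successor}
    for v, nxt in successor.items():
        if nxt is not None:
            preds[nxt].append(v)

    @lru_cache(maxsize=None)
    def f(G):
        if not G:
            return 0
        # Leaves of G: tasks all of whose predecessors are already done.
        leaves = [v for v in G if not any(p in G for p in preds[v])]
        l1 = [v for v in leaves if proc[v] == 1]
        l2 = [v for v in leaves if proc[v] == 2]
        if l1 and l2:
            reduced = [G - {g1, g2} for g1 in l1 for g2 in l2]
        else:
            reduced = [G - {g} for g in (l1 or l2)]
        return 1 + min(f(H) for H in reduced)

    return f(frozenset(successor))
```

For a root on processor 1 with two leaf predecessors, the result is 2
when the leaves need different processors and 3 when all three tasks
need the same one.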


As for the second problem, although several algorithms
are available (Nijenhuis & Wilf, 1978) for enumerating
subsets of a set under varying conditions, none of them can
be used to enumerate terminal subsets exclusively. In the
following, an enumeration algorithm is presented which
enumerates only the terminal subsets of a given tree. This
reduces storage space for indexing and eliminates the need
to check that a subset is terminal or alternatively the need
to derive f(G) for a subtree G that would never subsequently
be referenced.

The terminal subset enumeration scheme consists of two
parts, a labelling procedure, LABELTREE, which is presented
in Procedure 2 and a decoding procedure, DECODE, given in
Procedure 3. LABELTREE assigns labels to the nodes of a tree
so that each terminal subset may be indexed by the sum of
the labels of the nodes in the subset. It will be shown that
the terminal subsets have indices from 1 to I, where I is
the sum of all the labels. DECODE(j) constructs the terminal
subset whose index is j. Thus in order to enumerate the
terminal subsets it is sufficient to DECODE(j) for 1≤j≤I.


The labelling procedure groups the tasks into N chains
and assigns the same label to all tasks on the same chain.
Assume that a dummy final node (successor to the root) with
label 0 is temporarily added to the tree and that the
predecessors of each node have been arbitrarily ordered.
Then, the iterative step proceeds as follows. Suppose the
first (i-1) chains have been defined and labelled. Then,
define the i-th chain in the following manner:

(1) Find the most recently labelled node with an unlabelled
predecessor. If none exists, stop (all tasks have been
labelled). Add the next unlabelled predecessor of the
node to the new chain. (It is the last task on the
chain).

(2) Let v be the most recent node added to the chain. If v
has any predecessors (none of them has a label) add the
first predecessor of v to the chain and go to 2;
otherwise stop (chain i is complete).

This chain becomes the i-th chain. The label, a_i, assigned
to every task on this chain is one larger than the sum of
labels of all those labelled tasks that are not successors
of the last task on the chain. The sum, s_i, of labels of all
labelled tasks that are successors of the last task on the
chain is also saved for later use in the decoding process.
With this labelling, the index of a set will be the sum of
the labels of tasks in that set thus satisfying condition
(a) above for efficient computation of f(G).

    procedure LABELTREE;
    begin comment N is the number of chains.
            I is the index of the complete tree;
        i := a_1 := 1; x := t := n_1 := s_1 := 0;
        TRAVERSE(root of tree, x);
        N := i; I := t;
    end.

    procedure TRAVERSE(node, x);
    begin local integer k;
        if null node return;
        comment visit node;
        label node with a_i; add node to chain i;
        x := x + a_i; t := t + a_i; n_i := n_i + 1;
        comment traverse predecessors in pre-order;
        TRAVERSE(first predecessor of node, x);
        k := (number of predecessors of node) - 1;
        while k > 0 do
        begin comment start new chain;
            i := i + 1; n_i := 0; s_i := x;
            a_i := t + 1 - x;
            TRAVERSE(next predecessor of node, x);
            k := k - 1;
        end;
    end;

PROCEDURE 2

The procedure, as given, implements the above ideas
using a preorder traversal (Aho, Hopcroft & Ullman, 1974) of
the tree. The procedure keeps track of the sum, x, of the
labels of all labelled tasks that are successors of the task
to be visited and the total, t, of labels of all labelled
nodes. These are used in computing a_i and s_i when a new
chain is defined.

Given an index, j, the decoding procedure determines
the nodes of a unique subtree G of the original set of
nodes. It uses the following data set up by the labelling
procedure:

(1) the number of chains, N,

(2) the number of tasks, n_i, on chain i,

(3) the actual tasks on each chain in order,

(4) the label, a_i, of tasks on chain i,

(5) the sum of labels, s_i, defined above.

The procedure considers the chains in decreasing order of
their index i.e. in the reverse order from that in which
they were defined. Suppose chains N, N-1, ..., i+1 have been
considered i.e. the tasks from these chains have been
determined and their labels have been subtracted from j
giving current subset index c. Then one of the following
cases applies.

(1) c < s_i + a_i. Then k = 0 tasks of chain i belong to the
subset.

(2) c ≥ s_i + n_i a_i. Then k = n_i tasks of chain i belong to
the subset.

(3) s_i + k a_i ≤ c < s_i + (k+1) a_i (i.e. k = ⌊(c - s_i)/a_i⌋).
Then the last k tasks of chain i belong to the subset.

In any case, c is reduced by k a_i and the procedure considers
chain i-1.

    procedure DECODE(j);
    begin comment N is the number of chains.
            The arrays a_i, s_i, and n_i, 1≤i≤N, are as defined
            in Procedure 2;
        c := j; SUBSET := null;
        for i := N step -1 until 1 do
        begin comment find elements of chain i belonging
                to the subset;
            if c < s_i + a_i then k := 0 else
            if c ≥ s_i + n_i a_i then k := n_i else
            find k such that s_i + k a_i ≤ c < s_i + (k+1) a_i;
            add last k tasks of chain i to SUBSET;
            c := c - k a_i;
        end;
        return SUBSET;
    end.

PROCEDURE 3

The labelling and subsequent decoding of a particular
index j is illustrated by the following example.

Example 3.2:

LABELTREE: See Figure 17 for tree structure. The
ordering of predecessors to a node is indicated in the
figure by integers near the edges leading into each node.

    i | chain i | n_i | a_i | s_i
    --+---------+-----+-----+----
    1 | r, d, a |  3  |   1 |   0
    2 | z       |  1  |   2 |   2
    3 | e, b    |  2  |   4 |   2
    4 | h, f    |  2  |  13 |   1
    5 | g       |  1  |  26 |  14

DECODE(17):

    c = 17; SUBSET is empty.
    i=5: c < s_5 + a_5. (no task from chain 5)
    i=4: s_4 + a_4 ≤ c < s_4 + 2a_4.
         (take last task of chain 4)
         SUBSET = {h}; c = c - a_4 = 4.
    i=3: c < s_3 + a_3. (no task from chain 3)
    i=2: c ≥ s_2 + n_2 a_2. (take all chain 2 tasks)
         SUBSET = {h,z}; c = c - n_2 a_2 = 2.
    i=1: s_1 + 2a_1 ≤ c < s_1 + 3a_1.
         (take last two tasks of chain 1)
         SUBSET = {h,z,d,r}; c = c - 2a_1 = 0. □
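
Procedures 2 and 3 can be sketched together in Python and checked
against the values of Example 3.2. The tree used in the usage note is
inferred from the chain table above (Figure 17 itself is not
reproduced here), so its shape is an assumption, as are the function
names and the dictionary input format. Each chain is stored in
labelling order, so the "last k tasks" of a chain in the text (those
nearest the root) are the first k entries of the stored list.

```python
def label_tree(preds, root):
    """Label a terminally rooted tree as in LABELTREE.

    preds[v] is the ordered list of immediate predecessors of v.
    Returns (chains, a, s, I): chains[i] lists the nodes of chain i in
    labelling order, a[i] is their common label, s[i] the saved sum of
    successor labels, and I the sum of all labels. Chains are 0-based.
    """
    chains, a, s = [[]], [1], [0]
    total = 0                          # t in Procedure 2

    def traverse(node, x):             # x is passed by value, as in the text
        nonlocal total
        chains[-1].append(node)        # label node with the current a_i
        x += a[-1]
        total += a[-1]
        for k, p in enumerate(preds.get(node, [])):
            if k > 0:                  # every later predecessor starts a chain
                chains.append([])
                s.append(x)
                a.append(total + 1 - x)
            traverse(p, x)

    traverse(root, 0)
    return chains, a, s, total


def decode(j, chains, a, s):
    """Return the set of nodes whose labels sum to index j (as in DECODE)."""
    c, subset = j, set()
    for i in range(len(chains) - 1, -1, -1):
        if c < s[i] + a[i]:
            k = 0
        else:
            k = min(len(chains[i]), (c - s[i]) // a[i])
        subset.update(chains[i][:k])   # the k tasks of chain i nearest the root
        c -= k * a[i]
    return subset
```

Assuming the tree has root r with predecessors d and h, d with
predecessors a, z, e, e with predecessor b, and h with predecessors f
and g, `label_tree` reproduces the labels (1, 2, 4, 13, 26) and sums
(0, 2, 2, 1, 14) of the table, and `decode(17, ...)` returns
{h, z, d, r}.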

The following properties of LABELTREE and DECODE are
required for proving that the terminal subsets are indeed
given by DECODE(j) for 1≤j≤I.

Lemma 3.1: With the labelling of procedure LABELTREE
every terminal subset has a unique index j, 1≤j≤I.

Proof: The proof is by contradiction. Note that the
index of a set is the sum of the labels of its elements.
Since I is by definition the sum of all the labels, any set
must have index in the given range. Suppose two terminal
subsets U and V have the same index j. Let node u, with
label a_i, be the highest labelled node which is in one
subset but not in the other. Assume that node u is in U but
not in V. Then all labels in V which are less than a_i were
assigned by LABELTREE before a_i. By definition of a_i in the
procedure, a_i is larger than the sum of all smaller labels
in the tree which are not labels of successors of node u.
Thus, the following situation results:

(1) Every node in V with label larger than a_i is in U.

(2) Node u, with label a_i, is in U but not in V.

(3) Every successor of u in V is also in U since U is a
terminal subset and must have all successors of node u.

(4) The sum of labels of nodes in V whose labels are less
than a_i and which are not successors of u is strictly
less than a_i.

Hence, set U has a higher index than set V, a contradiction. □


Lemma 3.2: The set J = DECODE(j) identified by any
index j is unique for 1≤j≤I.

Proof: Consider the way in which j is decoded. In the
general step the tasks from chains N, N-1, ..., i+1 have been
determined and their labels subtracted from j giving an
index value c for the remaining elements. Now, suppose that
no task of chain i is in the set J. Then, J can contain
(among the remaining chains) at most all tasks with label
less than a_i. The labelling procedure guarantees that the
sum of labels of these tasks is at most s_i + a_i - 1. Hence,
c < s_i + a_i. Similarly, suppose that k tasks of chain i
belong to set J. Then J has sum of labels at least s_i + k a_i
for the k tasks and the successors of the last task of chain
i which must all be in the set. In addition, J may contain
all tasks with label less than a_i which are not successors
of the last task of chain i. Since this latter set of tasks
has total labels at most a_i - 1, s_i + k a_i ≤ c < s_i + (k+1) a_i. In
the case that all n_i tasks of chain i are in the set, the
inequality s_i + n_i a_i ≤ c is similarly obtained.

Since only one of the three cases

(1) c < s_i + a_i,

(2) s_i + k a_i ≤ c < s_i + (k+1) a_i, for some k, 1 ≤ k < n_i,

(3) c ≥ s_i + n_i a_i

can occur for a given c, the value of c uniquely identifies
the number of tasks from chain i. Thus, DECODE(j) produces a
unique subset for index j. □

It is also necessary to show that I is actually the
number of non-empty terminal subsets or, in other words,
that there is no j, 1≤j≤I, such that DECODE(j) yields a
non-terminal subset. This is done by showing that during the
decoding process, if node w is included in the subset and
node u is a successor of node w then u must subsequently be
included in the subset. The following property of the
labelling is required for this proof.

Let node u belong to the Q(u)-th chain.

Lemma 3.3: If node u is a successor of node w and
Q(u) < Q(v) < Q(w), then node u is a successor of node v.


Proof: Refer to procedure LABELTREE, Procedure 2. Since
node u is a successor of node w, in the preorder traversal
of the tree the call to the procedure to traverse w is made
and completed while the call to traverse u is suspended.
Also, since Q(u) < Q(v) < Q(w) and the chains are defined in
increasing order of their index, the call to traverse node v
must have been made after the call to traverse u and before
the call to traverse w. Consequently, node v is also visited
and the traversal of v completed while the traversal of u is
suspended.

Hence, node u is the root of a subtree containing both
v and w. Therefore, node u is a successor of node v. □


Lemma 3.4: Consider a tree labelled with procedure
LABELTREE. During the decoding of an index, j, of a subset,
if node u is a successor of node w and w is included in the
subset, then u must subsequently be included in the subset.


Proof: Refer to procedure DECODE, Procedure 3. The
proof is by induction on the number of chains. Note that for
any i, $1 \le i \le N$, the first i chains constitute a terminal
subtree and the labels of the subset form a labelling for the
subtree.

The theorem is trivial for the first or only chain.
Suppose it is true for the tree formed by the first (i-1)
chains, $1 < i \le N$. During the decoding, if no tasks of chain i
are selected, then the theorem applies to the first i
chains.

Suppose a task, w, of chain i is selected and u is a
successor of w. If Q(u) = Q(w) = i, then u is also selected
at the same time as w. If, on the other hand, u belongs to
a smaller numbered chain (all higher numbered chains have
been dealt with at this time), then there are two cases:

(1) There exists some task v with Q(u) < Q(v) < Q(w)
such that v is subsequently selected. By Lemma 3.3, u is also
a successor of v and by the induction hypothesis for the
first Q(v) chains, u must be subsequently selected.

(2) Now suppose that there is no task v with
Q(u) < Q(v) < Q(w) which is included in the subset. Let
Q(u) = j. Thus, after selecting a number of tasks including
w from chain i, no other tasks are selected until chain j,
j < i, is under consideration. Let c be the resulting subset
index after chains N, N-1, ..., i+1 have been considered and
let $k_i$ be the number of tasks from chain i which are
selected. Then, $c \ge s_i + k_i a_i$. This inequality still holds
at the time chain j is being considered. If $k_j$ tasks of
chain j (including task u) are successors of node w and
hence of the last task on chain i, then
$c - k_i a_i \ge s_i = s_j + k_j a_j$. Thus, the condition for
selecting at least $k_j$ tasks from chain j is satisfied.
Therefore, node u must be selected.

In any case, the theorem now applies to the first i
chains. By induction, it applies to all N chains. □

Theorem 3.3: For a tree labelled with LABELTREE, the
terminal subsets are given by DECODE(j), $1 \le j \le I$.

Proof: By Lemma 3.1 every terminal subset has a unique
index between 1 and I. By Lemmas 3.2 and 3.4, DECODE(j),
$1 \le j \le I$, yields a unique terminal subset. Thus, there is a
one-to-one correspondence between the first I integers and
the non-empty terminal subsets of the given tree. □

It is easy to check for Example 3.2 that the number
I = 65 obtained is indeed the number of non-empty terminal
subsets of the tree. The number of terminal subsets of a
tree satisfies the following recurrence relation. Let h(u)
be the number of non-empty terminal subsets of the tree
rooted at node u. Since a terminal subtree of a tree rooted
at node u consists of u and some terminal subtrees rooted at
predecessors of u, $h(u) = \prod_v \{1 + h(v)\}$, where v ranges
over all immediate predecessors of u (so that h(u) = 1 when u
has no predecessors). Applying this recurrence relation to
the tree of Example 3.2 yields h(r) = 65.
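The recurrence can be checked on a small tree. The sketch below reads the recurrence as a product, over the immediate predecessors v of u, of (1 + h(v)): each predecessor either contributes nothing or contributes one of its own non-empty terminal subtrees. The tree used here is a made-up example, not the tree of Example 3.2, and the function names are illustrative only.

```python
from itertools import combinations


def count_terminal_subsets(preds, root):
    """h(root): product over immediate predecessors v of (1 + h(v))."""
    h = 1
    for v in preds.get(root, []):
        h *= 1 + count_terminal_subsets(preds, v)
    return h


def brute_force_count(preds, root):
    """Count non-empty subsets that are closed under taking successors."""
    succ = {v: u for u, vs in preds.items() for v in vs}
    nodes = [root] + list(succ)
    count = 0
    for r in range(1, len(nodes) + 1):
        for subset in combinations(nodes, r):
            chosen = set(subset)
            # terminal: every chosen node has its immediate successor chosen
            if all(n == root or succ[n] in chosen for n in chosen):
                count += 1
    return count


# Hypothetical terminally rooted tree: preds[u] lists the immediate
# predecessors of u; "r" is the root (the last task to be executed).
preds = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
h_r = count_terminal_subsets(preds, "r")
```

For this six-node tree both functions give 15, against 63 non-empty subsets in total.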


The main part of the solution of the tree problem is
outlined in Procedure 4. Here, L(g) is the label given to a
node, g, and F(j) is the length of an optimal schedule for
the subtree whose index is j. The procedure outline follows
closely the previous discussion and requires no further
comments.

In order to construct the actual schedule another array
of I pointers should be maintained, indicating for each
index, j, which subtree was used to obtain F(j). With this
array of pointers and the DECODE procedure, subsequent
construction of an optimal schedule is straightforward.
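The dynamic program of Procedure 4 can be sketched in a few lines under simplifying assumptions. Instead of LABELTREE indices, this sketch identifies each terminal subset by a frozenset of task names, and it charges one time unit per step (the `1 +` term is part of this sketch's formulation of unit execution times). The function name and the input encoding are hypothetical.

```python
from functools import lru_cache


def min_schedule_length(successor, processor):
    """Minimal schedule length for unit-time tasks on two processors.

    successor[t] is the immediate successor of task t (None for the root)
    and processor[t] is 1 or 2, the processor to which t is bound.
    """
    preds = {t: [] for t in successor}
    for t, s in successor.items():
        if s is not None:
            preds[s].append(t)

    @lru_cache(maxsize=None)
    def F(remaining):
        if not remaining:
            return 0
        # leaves: remaining tasks none of whose predecessors remain
        leaves = [t for t in remaining
                  if not any(p in remaining for p in preds[t])]
        # None stands for "no leaf available on this processor"
        l1 = [t for t in leaves if processor[t] == 1] or [None]
        l2 = [t for t in leaves if processor[t] == 2] or [None]
        return min(1 + F(remaining - {g1, g2}) for g1 in l1 for g2 in l2)

    return F(frozenset(successor))


# Hypothetical four-task tree: a, b, c all precede the root r.
succ = {"a": "r", "b": "r", "c": "r", "r": None}
proc = {"a": 1, "b": 1, "c": 2, "r": 2}
length = min_schedule_length(succ, proc)
```

Every set reached from the full tree by removing leaves is again a terminal subset, so the memo table holds at most I entries, in line with the role of the array F in Procedure 4.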

Now, consider the time and space requirements of
TREE-ALG. It is easily seen that both LABELTREE and DECODE
each take O(n) time where n is the number of nodes. For any
terminal subtree, G, the number of pairs, $(g_1, g_2)$, of
processor-1 and processor-2 leaves is no more than $n^2$.
Hence, TREE-ALG requires at most $O(n^2 I)$ time.

As for storage, note that since a tree of n nodes has
n-1 edges, the storage of a tree structure requires only
O(n) storage locations. Similarly, LABELTREE and DECODE also
require linear storage. During the terminal subtree
enumeration, $O(n^2)$ locations are needed for the pairs,
$(g_1, g_2)$. However, the same locations can be used for every
subtree since it is only necessary to save F(j). Hence, the
storage requirement is at most $O(n^2 + I)$.

procedure TREE-ALG;
begin comment L(g) is the label of node g.
              I is the index of the complete tree.
              F(j) is the length of an optimal schedule for
              subtree with index j;
    LABELTREE;  (note I is computed by LABELTREE)
    for j := 1 until I do
    begin G := DECODE(j);
        determine processor-1 leaves of G;
        determine processor-2 leaves of G;
        F(j) := min{ F(j - L(g1) - L(g2)) },
            where the minimum is taken over all pairs of
            processor-1 leaves, g1, and processor-2
            leaves, g2, of G;
    end;
end;

PROCEDURE 4


Thus, the complexity of the algorithm is really
determined by the number of subsets generated. Although the
number of subsets is exponential in n, in general, the
presence of precedence constraints significantly reduces the
number of terminal subsets. In Example 3.2, only 65 subsets
would be generated as compared to $2^9 - 1 = 511$ non-empty
subsets of the nine tasks.


Finally, consider the other types of precedence graphs.
Solutions for initially rooted trees may be obtained by
initially reversing all the directions, applying the
terminally rooted tree method and finally reversing the
resulting schedule. Similarly, solutions for forests may be
obtained by adding a dummy root which is connected to the
roots of all the trees, applying the corresponding tree
algorithm and finally deleting the dummy root from the
schedule. As for the general acyclic digraph, it is obvious
that a good dynamic programming solution along the lines
discussed above hinges on the availability of a good
terminal subset enumerator. Recently, Achugbue and Chin
(1980) have developed a terminal subset enumerator for
arbitrary precedence graphs which takes at most O(eI) time
and O(e) space, where e is the number of edges.

3.6  Discussion 

In this chapter, the NP-completeness of the minimal
length scheduling problem for UET processor-bound systems
introduced by Goyal (1977) for the case of two processors
and an arbitrary number of chains has been demonstrated.
This appears to be the simplest case of the processor-bound
systems that is NP-complete and indeed the result subsumes
all NP-completeness results of Goyal as well as the case of
a fixed number of processors $m \ge 2$ and precedence
constraints in the form of trees, which he left as an open
problem.

In addition, a dynamic programming approach is proposed
for finding minimal length schedules for these systems which
represents a significant improvement over simple
enumeration. This approach requires a terminal subset
enumeration scheme. In the case of trees (and forests), a
terminal subset enumeration algorithm is presented. It is
efficient in the sense that it never enumerates a
non-terminal subset. For general precedence graphs, the
scheme of Baker and Schrage (1978) is fast, but it includes
some non-terminal subsets in its enumeration. The recent
algorithm of Achugbue and Chin (1980) is reasonably fast and
never enumerates non-terminal subsets.


Chapter  Four 


FLOW  SHOP  SCHEDULES 

In the flow shop model, considered in this chapter, and
the open shop of the following chapter, several related
tasks are grouped together to form a job. Thus, a flow shop
consists of m processors, $P_j$, $1 \le j \le m$, and n jobs, $J_i$,
$1 \le i \le n$, where job $J_i$ contains m tasks or stages, $T_i[j]$,
$1 \le j \le m$. The flow shop is processor bound since task
$T_i[j]$ must be executed on the j-th processor, and it is
characterized by a unidirectional flow of tasks, that is,
task $T_i[j]$ must be executed before $T_i[j+1]$ for any i.

When dealing with two- or three-stage shops, it is
often more convenient to refer to the component tasks of job
$J_i$ as tasks $A_i$, $B_i$ and $C_i$ (to be executed on processors
$P_1$, $P_2$ and $P_3$ respectively).

The flow shop is perhaps the first of the processor
bound models to be studied extensively, probably due to the
fact that it closely simulates assembly line systems. Much
of what is known about the model is reported in Conway,
Maxwell and Miller's book (1965) and the more recent text
by Baker (1974). The emphasis in this chapter is on
non-preemptive schedules minimizing schedule length for
three-stage systems.
4.1 Survey

Johnson (1954) showed that for two and three-processor
flow shop minimal length non-preemptive scheduling problems
it is sufficient to consider only permutation schedules, in
which the jobs are scheduled in the same order on all the
processors and a processor is not kept idle if a task for
that processor is ready for execution (thus, permutation
schedules are keep-busy schedules). He further gave the
well-known O(n log n) solution for the two-processor system.
The rule for obtaining an optimal schedule in a two-stage
flow shop is to schedule the i-th job before the j-th if
$\mathrm{MIN}(A_i, B_j) \le \mathrm{MIN}(A_j, B_i)$.
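A common way to realize the two-stage rule in O(n log n) time is to sort the jobs into two groups. The sketch below uses that standard formulation (jobs with $A_i \le B_i$ first, by increasing $A_i$, then the rest by decreasing $B_i$); the function names and the five-job instance are illustrative and are not taken from the thesis.

```python
from itertools import permutations


def johnson_order(jobs):
    """jobs: list of (A_i, B_i). Returns job indices in Johnson's order."""
    first = sorted((i for i, (a, b) in enumerate(jobs) if a <= b),
                   key=lambda i: jobs[i][0])
    last = sorted((i for i, (a, b) in enumerate(jobs) if a > b),
                  key=lambda i: jobs[i][1], reverse=True)
    return first + last


def makespan2(jobs, order):
    """Length of the two-processor permutation schedule for `order`."""
    t1 = t2 = 0
    for i in order:
        a, b = jobs[i]
        t1 += a                  # stage 1 finishes on processor 1
        t2 = max(t2, t1) + b     # stage 2 waits for stage 1 and processor 2
    return t2


jobs = [(3, 6), (5, 2), (1, 2), (6, 6), (7, 5)]   # made-up instance
order = johnson_order(jobs)
best = min(makespan2(jobs, p) for p in permutations(range(len(jobs))))
```

On this instance the rule yields the order [2, 0, 3, 4, 1] with length 24, which matches the exhaustive minimum `best`.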

For three-stage flow shops, solutions have been
presented for several special cases. These are itemized
below by original author:

(1) Johnson (1954) extended his two-stage rule to the case
of three processors and showed that the extended
version produces minimal length schedules, also in
O(n log n) time, whenever the task system satisfies either
$A_i \ge B_j$ for all i and j, or $C_i \ge B_j$ for all i and j.

He further conjectured that when his two-stage
rule applied to the first two stages yields the same
permutation as that for the last two stages, then the
permutation is optimal for the three stage problem.
However, Burns and Rooker (1976) showed that this is
not always the case. Johnson's conjecture is true if
for each application of the two stage rule, the
inequality is strict for all job pairs, or if the
permutation resulting from the first two applications
is also optimal for the first and third processors.

(2) Arthanari and Mukhopadhyay (1971) gave an $O(n^2)$
algorithm for systems with either $A_i \le B_j$ for all i, j or
$C_i \le B_j$ for all i, j.

(3) Smith, Panwalker and Dudek (1975) considered systems
with ordered processing time matrices, in which the
relative order (in terms of task length) of the tasks
in every job is the same, and in addition, if one task
of job $J_i$ is less than the corresponding task of job
$J_j$, then every task of job $J_i$ is less than the
corresponding task of job $J_j$.

(4) The system of Burns and Rooker (1975) must satisfy the
condition that the product of
$\mathrm{MIN}(A_i, B_j) - \mathrm{MIN}(B_i, C_j)$ and
$\mathrm{MIN}(A_j, B_i) - \mathrm{MIN}(B_j, C_i)$ be non-negative. Their algorithm
is  easily  seen  to  require  O(n^)  time  at  most. 

More recently (1978), they gave an O(n log n)
algorithm for the case in which $B_i \le \mathrm{MIN}(A_i, C_i)$ for
all i.

(5) Swarc (1977) also gave an O(n log n) algorithm for the
case in which $B_i = B_j$ and if $A_i \le A_j$ then $C_i \ge C_j$ for
all i and j.

Another case due to Swarc can be described as
follows. Let permutation, p, be optimal for the two
stage problem in which job i has tasks $A_i + B_i$, $B_i + C_i$,
with associated optimal length $U_p$. If

    $U_p = L_p + \sum_{i=1}^{n} B_i$,

then p is optimal for the three stage problem, where $L_p$
is the length of the three stage schedule using
permutation p.

For the general problem with $m \ge 2$ several heuristic
methods have also been devised (see Baker, 1974). However,
the problem is NP-complete as shown by Gonzalez, Johnson and
Sahni (1976) and Gonzalez and Sahni (1978).

In the following sections, several new results are
given for some interesting special three stage flow shops.

4.2 J-Maximal and J-Minimal Flow Shops

A flow shop is said to be j-maximal (j-minimal) if the
j-th task of each job is not smaller than (greater than) any
other task of the same job. This criterion is far less
restrictive than the ordered processing time flow shop of
Smith, Panwalker and Dudek (1975). J-maximal and j-minimal
flow shops have been studied by Chin and Tsai (1978) and
bounds on the performance of the worst solutions as compared
to the best possible were derived.
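The definition translates directly into a check over the task lengths. The sketch below is illustrative only: the function names are hypothetical, and a shop is represented as a list of tuples with one task length per stage.

```python
def maximal_stages(jobs):
    """Stages j (1-based) for which the shop is j-maximal."""
    m = len(jobs[0])
    return [j + 1 for j in range(m)
            if all(job[j] >= max(job) for job in jobs)]


def minimal_stages(jobs):
    """Stages j (1-based) for which the shop is j-minimal."""
    m = len(jobs[0])
    return [j + 1 for j in range(m)
            if all(job[j] <= min(job) for job in jobs)]


# Made-up three-stage shop: stage 2 is never larger than the other stages.
shop = [(5, 2, 3), (4, 1, 6), (7, 7, 9)]
shop_max = maximal_stages(shop)
shop_min = minimal_stages(shop)
```

Here `shop_min` is [2] and `shop_max` is empty: the shop is 2-minimal but not j-maximal for any j, since the largest task falls on stage 1 in the first job and on stage 3 in the other two. Ties are allowed, so a shop may be j-minimal for more than one j.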

It is easily checked that known proofs of
NP-completeness of the minimal length flow shop scheduling
problem (Gonzalez, Johnson & Sahni, 1976; Gonzalez & Sahni,
1978) involve the setting up of flow shops which do not have
the j-maximal or j-minimal criterion for any value of j.
Hence, their results do not carry over to the cases under
consideration.

In the following, it is shown that the three-stage
problem remains NP-complete except for the 2-minimal case
which is solved in O(n log n) steps.

Chin and Tsai show that the 2-minimal problem is
NP-complete under the assumption that if a job has a
zero-length task on a certain processor, then the job does
not have to visit that processor. With this interpretation
of zero-length tasks it is no longer true that an optimal
schedule may be found among the permutation schedules.
However, a more realistic interpretation of zero-length
tasks for the flow shop model is to consider them as tasks
with infinitesimal time requirement. Thus, each job must
visit every processor even when a job has a zero-length task
for a processor. The latter interpretation is adopted in
this thesis.


4.2.1  2-Minimal  Flow  Shop:  A  Solvable  Case 


Given a permutation p of the first n integers,
(p(1), p(2), ..., p(n)), and any u, v with $1 \le u \le v \le n$, the
number

    $L_p(u,v) = \sum_{i=1}^{u} A_{p(i)} + \sum_{i=u}^{v} B_{p(i)} + \sum_{i=v}^{n} C_{p(i)}$

is a lower bound on the length of the schedule derived from
p. In fact, the length of that schedule, $L_p$, is
$\mathrm{MAX}\{L_p(u,v) \mid 1 \le u \le v \le n\}$.
This follows from the fact that job p(u) (that is, the
p(u)-th job, or task $B_{p(u)}$ to be exact) cannot start until
after processor 1 has executed stage 1 of all preceding
jobs, task $C_{p(v)}$ cannot start until processor 2 finishes all
previous stage 2 tasks (in particular, those between p(u)
and p(v)), and finally, after the processing of task
$C_{p(v)}$, the third processor has to do the remaining
third-stage tasks. This lower bound with different u and v
will be used repeatedly in the following.
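The identity between the schedule length and the largest of the bounds $L_p(u,v)$ can be checked numerically. In the sketch below, `makespan3` simulates the keep-busy permutation schedule directly and `makespan_by_bounds` takes the maximum over all pairs u, v; the names and the three-job instance are illustrative only.

```python
from itertools import permutations


def makespan3(jobs, order):
    """Length of the three-processor permutation schedule for `order`."""
    t1 = t2 = t3 = 0
    for i in order:
        a, b, c = jobs[i]
        t1 += a
        t2 = max(t2, t1) + b
        t3 = max(t3, t2) + c
    return t3


def L(jobs, order, u, v):
    """L_p(u, v) for 1-based positions u <= v in the permutation."""
    stage1 = sum(jobs[i][0] for i in order[:u])        # positions 1..u
    stage2 = sum(jobs[i][1] for i in order[u - 1:v])   # positions u..v
    stage3 = sum(jobs[i][2] for i in order[v - 1:])    # positions v..n
    return stage1 + stage2 + stage3


def makespan_by_bounds(jobs, order):
    n = len(order)
    return max(L(jobs, order, u, v)
               for u in range(1, n + 1) for v in range(u, n + 1))


jobs = [(3, 2, 4), (1, 5, 2), (4, 1, 3)]   # made-up (A_i, B_i, C_i)
agree = all(makespan3(jobs, p) == makespan_by_bounds(jobs, p)
            for p in permutations(range(len(jobs))))
```

The direct simulation takes O(n) time per permutation, while the maximum over all pairs (u, v) is mainly useful as a tool in proofs.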


Johnson's (1954) proposal for the three-stage problem
is to schedule the i-th job before the j-th job if

    $\mathrm{MIN}(A_i + B_i,\ B_j + C_j) \le \mathrm{MIN}(A_j + B_j,\ B_i + C_i)$    (1)

One of the special cases solved by the above rule, as
recently demonstrated by Burns and Rooker (1978), is (by
definition of j-minimal) a 2-minimal flow shop. The proof is
straightforward and is sketched below for completeness.


Theorem 4.1 (after Burns & Rooker, 1978): The 2-minimal
three-stage flow shop scheduling problem can be solved in
O(n log n) time by the application of Johnson's rule (1).
Proof: Let p = (p(1), ..., p(n)) be a permutation of the
first n integers. Consider the two-processor flow shop with
n jobs such that for the i-th job, the task on the first
processor takes $(A_i + B_i)$ time and the other task takes
$(B_i + C_i)$. Then rule (1) above simply applies Johnson's
optimal procedure for two processors. In other words, rule
(1) finds a permutation which minimizes
$\mathrm{MAX}\{L_p(i,i) \mid 1 \le i \le n\}$. Therefore, in order to prove the
theorem it is sufficient to show that

    $L_p(u,v) \le \mathrm{MAX}\{L_p(u,u),\ L_p(v,v)\}$,    (2)

for all u<v and all permutations p.

Suppose (2) does not hold.
Then, $L_p(u,v) > L_p(u,u)$ and $L_p(u,v) > L_p(v,v)$,
for some u, v. This yields

    $\sum_{i=u+1}^{v} B_{p(i)} > \sum_{i=u}^{v-1} C_{p(i)}$

and

    $\sum_{i=u}^{v-1} B_{p(i)} > \sum_{i=u+1}^{v} A_{p(i)}$,

from which one obtains $B_{p(v)} > C_{p(u)}$ and $B_{p(u)} > A_{p(v)}$.
Since in a 2-minimal flow shop $A_i \ge B_i$ and $C_i \ge B_i$,
$1 \le i \le n$, it follows that
$B_{p(v)} > C_{p(u)} \ge B_{p(u)} > A_{p(v)}$. Hence,
$B_{p(v)} > A_{p(v)}$ which is a contradiction.

An O(n log n) implementation of the given rule is easily
envisaged.
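One possible O(n log n) implementation is sketched below: Johnson's two-group sort is applied to the surrogate two-stage jobs $(A_i + B_i,\ B_i + C_i)$. The function names and the five-job 2-minimal instance are assumptions made for illustration, and the exhaustive search serves only as a cross-check.

```python
from itertools import permutations


def rule1_order(jobs):
    """Johnson's rule applied to the surrogate jobs (A+B, B+C)."""
    key = [(a + b, b + c) for a, b, c in jobs]
    first = sorted((i for i, (x, y) in enumerate(key) if x <= y),
                   key=lambda i: key[i][0])
    last = sorted((i for i, (x, y) in enumerate(key) if x > y),
                  key=lambda i: key[i][1], reverse=True)
    return first + last


def makespan3(jobs, order):
    """Length of the three-processor permutation schedule for `order`."""
    t1 = t2 = t3 = 0
    for i in order:
        a, b, c = jobs[i]
        t1 += a
        t2 = max(t2, t1) + b
        t3 = max(t3, t2) + c
    return t3


# Made-up 2-minimal instance: B_i <= A_i and B_i <= C_i for every job.
jobs = [(4, 2, 5), (3, 3, 3), (6, 1, 2), (2, 2, 7), (5, 4, 4)]
order = rule1_order(jobs)
length = makespan3(jobs, order)
best = min(makespan3(jobs, p) for p in permutations(range(len(jobs))))
```

For this instance the rule gives the order [3, 0, 1, 4, 2] with length 25, equal to the exhaustive minimum over all 120 permutations.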
4.2.2 NP-Complete Cases

The easy solution of the 2-minimal case for three
processors might lead one to conjecture that other j-maximal
or j-minimal cases might just as easily be solved. In this
section it is shown that this is definitely not the case.

The NP-completeness of the 2-maximal three-stage flow
shop problem is easily demonstrated by a reduction from
PARTITION (see Section 1.2.2).


Theorem 4.2: The 2-maximal three-stage flow shop
minimal length scheduling problem is NP-complete.

Proof: Given an instance of PARTITION, define the
following 2-maximal flow shop containing n+1 jobs.

    $A_i = C_i = 0$, for $1 \le i \le n$.
    $B_i = a_i$, for $1 \le i \le n$.
    $A_{n+1} = B_{n+1} = C_{n+1} = K$.

The deadline for the flow shop problem is D = 3K.

By considering Figure 18, it is obvious that there
exists a permutation schedule for the flow shop whose length
does not exceed 3K if and only if the PARTITION instance has
a solution. □
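The reduction can be tried out on very small PARTITION instances. The sketch below assumes that K is half the sum of the $a_i$ (the definition in Section 1.2.2 is not reproduced here, so this is an assumption) and searches all permutation schedules exhaustively; the function names are hypothetical.

```python
from itertools import permutations


def makespan3(jobs, order):
    """Length of the three-processor permutation schedule for `order`."""
    t1 = t2 = t3 = 0
    for i in order:
        a, b, c = jobs[i]
        t1 += a
        t2 = max(t2, t1) + b
        t3 = max(t3, t2) + c
    return t3


def partition_to_flow_shop(a):
    """Build the 2-maximal shop of the proof; assumes sum(a) = 2K."""
    K = sum(a) // 2
    jobs = [(0, x, 0) for x in a] + [(K, K, K)]
    return jobs, 3 * K


def meets_deadline(jobs, deadline):
    """Exhaustive search, only sensible for a handful of jobs."""
    return any(makespan3(jobs, p) <= deadline
               for p in permutations(range(len(jobs))))


yes_jobs, yes_deadline = partition_to_flow_shop([1, 2, 3])   # {3} and {1, 2}
no_jobs, no_deadline = partition_to_flow_shop([1, 1, 4])     # no equal split
```

The first instance admits a schedule of length 3K = 9 (the job with $a_i = 3$, then job n+1, then the other two), whereas no permutation meets the deadline for the second instance.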


Incidentally, the flow shop of Theorem 4.2 is also
1-minimal as well as 3-minimal and serves to show
NP-completeness of these cases. They were not included in
the statement of that theorem because stronger reductions
from 3-PARTITION will be presented.


Lemma 4.1: Consider the following 1-minimal (n+2)-job
flow shop:

    $A_i = 2(i-1)K$, $1 \le i \le n+1$;
    $B_i = (2i-1)K$, $1 \le i \le n+1$;
    $C_i = 2(i+1)K$, $1 \le i \le n+1$;
    $A_{n+2} = C_{n+2} = 0$,
    $B_{n+2} = C_{n+1} = 2(n+2)K$; for K > 0.

The permutation (1, ..., n+2) is the unique optimal
permutation.

Proof: See Figure 19 for the case n = 4. Note that the
given schedule has total idle time of length K on the third
processor, and that for any schedule the third processor
must be at least idle during the execution of the first
second-stage task. Since K is the minimum second-stage task
the given permutation must be optimal.


Now, consider an arbitrary optimal permutation
(p(1), ..., p(n+2)). If $p(1) \ne 1$ then $B_{p(1)} > K$ and there is
an idle period greater than K on the third processor,
contradicting optimality of p. Hence, p(1) must be 1.
Similarly, if $p(2) \ne 2$ then $B_{p(2)} > C_{p(1)} = 4K$ and an extra
idle period is generated on the third processor. Hence, p(2)
must be 2. Continuing in this manner, it is clear that the
permutation p must be precisely the one given in the lemma.
Note that the length of the optimal schedule is
$(n^2 + 5n + 6)K$ and that any other permutation yields a
schedule strictly longer than this.

Theorem 4.3: The 1-minimal flow shop minimal length
scheduling problem is NP-complete.

Proof: Given an instance of 3-PARTITION with the set
$\{a_1, \ldots, a_{3n}\}$, the $a_i$ summing to nK and $K/4 < a_i < K/2$,
construct the following instance of a 1-minimal flow shop
minimal length scheduling problem.

Use all the jobs of Lemma 4.1 and include the
following: $A_{n+2+j} = C_{n+2+j} = 0$, $B_{n+2+j} = a_j$, for
$1 \le j \le 3n$. The target schedule length is
$D = (n^2 + 5n + 6)K$, the length of the optimal schedule of
Lemma 4.1.

The unique optimal schedule of Lemma 4.1 leaves n idle
periods each of length exactly K into which the 3-element
partitions (stage 2 of the new jobs) can be placed if the
3-PARTITION problem has a solution. Thus, there exists a
schedule whose length is D.

Conversely, by the uniqueness property in Lemma 4.1,
the only type of schedules for the flow shop which can
possibly finish by time D are those which contain the jobs
of Lemma 4.1 in their optimal order as given in Figure 19.
That order can therefore be considered as a skeleton of any
optimal schedule. Consider the possible placements of the
new jobs in the skeleton. It is impossible to insert the new
jobs at either end without lengthening the schedule.
Furthermore, the first set of n+1 jobs leave n periods on
processor 2 into which the remaining jobs can be placed only
if the set $\{a_i\}$ has a 3-partition (note that the
$K/4 < a_i < K/2$ property of the $a_i$ is essential here). □

Theorem 4.4: The 3-minimal flow shop minimal length
scheduling problem is NP-complete.

Proof: Similar to Lemma 4.1 and Theorem 4.3. For the
lemma part use the n+2 jobs:

    $A_i = 2(i+1)K$, $1 \le i \le n+1$;
    $B_i = (2i-1)K$, $1 \le i \le n+1$;
    $C_i = 2(i-1)K$, $1 \le i \le n+1$;
    $A_{n+2} = C_{n+2} = 0$, $B_{n+2} = A_{n+1} = 2(n+2)K$.

The unique optimal permutation is (n+2, n+1, ..., 1). The
same set of jobs as for Theorem 4.3 is added to the above
set of n+2 jobs. □


Lemma 4.2: Consider the following 1-maximal (n+1)-job
flow shop:

    $A_i = (i^2 + 3i + 4)K/2$,
    $B_i = (i^2 + i + 4)K/2$,
    $C_i = (i^2 - i + 2)K/2$, for $1 \le i \le n+1$, K > 0.

The permutation (n+1, n, ..., 2, 1) is the unique optimal
permutation.

Proof: See Figure 20 for a schedule with the tasks in
the specified order for n=3. The optimality of the
permutation given in Figure 20 can be derived from the fact
that this permutation is optimal on the first two
processors (i.e. considering the 2-processor flow shop with
each job consisting of the first two tasks) and the fact
that the optimal schedule length on the three processors is
no less than the optimal schedule length on the first two
processors plus the smallest task length on the third
processor. Since the length of the schedule in Figure 20 is
the sum of the optimal schedule for the first two processors
and the smallest task length on the third, the given
permutation is optimal.

[FIGURE 20]


Notice that the length of the given permutation
schedule is

    $L_{opt} = A_{n+1} + \sum_{i=1}^{n+1} B_i + C_1$
           $= (n^3 + 9n^2 + 38n + 48)K/6$.    (3)

It remains to prove the uniqueness property. Consider
a given optimal permutation p. p(1) must be n+1 for
otherwise the length of the corresponding schedule, $L_p$, is
bounded as follows:

    $L_p \ge L_p(u, n+1) = \sum_{i=1}^{u} A_{p(i)} + \sum_{i=u}^{n+1} B_{p(i)} + C_{p(n+1)}$,

where p(u) = n+1, u>1. But $C_{p(n+1)} \ge C_1$ and
$A_{p(i)} > B_{p(i)}$, for any i. Hence, by comparing with the
expression for $L_{opt}$ in Equation (3), $L_p > L_{opt}$, a
contradiction. Thus, p(1) = n+1. Similarly, it can be shown
that p(n+1) = 1.

Thus, the permutation p starts with n+1 and finishes with 1.
The rest of the proof consists of showing by induction that
n must precede n-1 and then that n-1 must precede n-2 and so
on. Incidentally, the fact that p(1) = n+1 implies that n+1
must precede n. Assume that j+1 precedes j for
$u+1 \le j < n+1$. Let p(s) = u+1. A lower bound on the length
of schedule p is

    $L_p(1,s) = A_{n+1} + \sum_{i=1}^{s} B_{p(i)} + \sum_{i=s}^{n+1} C_{p(i)}$.    (4)

Now, $C_{p(n+1)} = C_1$ and the summation of the $B_{p(i)}$ in
Equation (4) includes the second stage of all jobs j for
$j \ge u+1$ by hypothesis. If in addition u precedes u+1 then
$\sum_{i=s}^{n+1} C_{p(i)}$ is a sum of $C_{u+1}$, $C_1$, and a subset of
$\{C_2, \ldots, C_{u-1}\}$.

For $1 \le i \le n+1$,

    $B_i - C_i = (i^2 + i + 4)K/2 - (i^2 - i + 2)K/2 = (i + 1)K$.

Therefore,

    $\sum_{i=1}^{u-1} (B_i - C_i) = (u(u-1)/2 + u - 1)K = (u^2 + u)K/2 - K$.

Hence,

    $\sum_{i=1}^{u-1} B_i = (u^2 + u)K/2 - K + \sum_{i=1}^{u-1} C_i$
        $= (u^2 + u)K/2 + \sum_{i=2}^{u-1} C_i$    (note $C_1 = K$)
        $< (u^2 + u + 2)K/2 + \sum_{i=2}^{u-1} C_i$
        $= C_{u+1} + \sum_{i=2}^{u-1} C_i$.

Notice further that the inequality holds if $B_j$ is removed
from the left and $C_j$ from the right for 1<j<u since
$B_j \ge C_j$. Thus, the sum of $B_1$ and any subset of
$\{B_2, \ldots, B_{u-1}\}$ is strictly less than the sum of $C_{u+1}$ and
the corresponding subset of $\{C_2, \ldots, C_{u-1}\}$.

Comparing the expressions for $L_p(1,s)$ and $L_{opt}$, it
follows that $L_p(1,s) > L_{opt}$. Hence, either u+1 precedes u or
a contradiction is obtained. By induction this property
holds for all u.
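For small n the lemma can be confirmed by exhaustive search. The sketch below builds the n+1 jobs of Lemma 4.2, evaluates every permutation schedule and compares the result with the closed form of Equation (3). The function names are illustrative, and K = 1 is an arbitrary choice (the three numerators are always even, so the task lengths are integral for any integer K).

```python
from itertools import permutations


def lemma_4_2_jobs(n, K=1):
    """Jobs 1..n+1 of Lemma 4.2 as {i: (A_i, B_i, C_i)}."""
    return {i: ((i * i + 3 * i + 4) * K // 2,
                (i * i + i + 4) * K // 2,
                (i * i - i + 2) * K // 2)
            for i in range(1, n + 2)}


def makespan3(jobs, order):
    """Length of the three-processor permutation schedule for `order`."""
    t1 = t2 = t3 = 0
    for i in order:
        a, b, c = jobs[i]
        t1 += a
        t2 = max(t2, t1) + b
        t3 = max(t3, t2) + c
    return t3


def optimal_permutations(jobs):
    """Return the minimum length and every permutation attaining it."""
    lengths = {p: makespan3(jobs, p) for p in permutations(jobs)}
    best = min(lengths.values())
    return best, [p for p, length in lengths.items() if length == best]


n, K = 3, 1
best, winners = optimal_permutations(lemma_4_2_jobs(n, K))
closed_form = (n**3 + 9 * n**2 + 38 * n + 48) * K // 6
```

For n = 3 the search finds a single optimal permutation, (4, 3, 2, 1), of length 45K, in agreement with Equation (3).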
Theorem 4.5: The 1-maximal flow shop minimal length
scheduling problem is NP-complete.

Proof: Given an instance of 3-PARTITION with set
$\{a_1, \ldots, a_{3n}\}$, the $a_i$ summing to nK and $K/4 < a_i < K/2$,
construct the following instance of a 1-maximal flow shop
problem.

Use all the jobs in Lemma 4.2 and include the
following:

    $A_{n+1+j} = C_{n+1+j} = a_j$, $B_{n+1+j} = 0$, for $1 \le j \le 3n$.

The target schedule length is
$D = (n^3 + 9n^2 + 38n + 48)K/6$, which is the optimal schedule
length of Lemma 4.2.

After scheduling the first n+1 jobs as specified in
Lemma 4.2, n equal periods of length K can be obtained on
processor 1 by shifting the first stage tasks as far right
as possible (this has been done in Figure 20) and there
are n similar periods appropriately positioned on the third
processor. Thus, the final 3n jobs can be scheduled in these
spaces if the 3-PARTITION problem has a solution. Observe
that the resulting schedule is indeed a permutation
schedule.

Conversely, it is only necessary to consider schedules
for the flow shop which contain the first n+1 jobs in their
optimal order for otherwise the deadline, D, will definitely
be exceeded, by the uniqueness property of Lemma 4.2. It is
obvious that none of the final 3n jobs can be inserted into
the skeleton before the first job (i.e. n+1) or after the
last job (job 1). Furthermore, if some of the final 3n jobs
with total processing time at the first stage exceeding K
are inserted between any two jobs of the skeleton,
additional idle periods will be generated on the second
processor (see Figure 20) and the schedule becomes
suboptimal. Hence, a schedule of length D can be obtained
only if the 3-PARTITION problem has a solution. □

Theorem 4.6: The 3-maximal flow shop minimal length
scheduling problem is NP-complete.

Proof: Similar to Lemma 4.2 and Theorem 4.5. For the
lemma part use the jobs,

    $A_i = (i^2 - i + 2)K/2$,
    $B_i = (i^2 + i + 4)K/2$,
    $C_i = (i^2 + 3i + 4)K/2$, $1 \le i \le n+1$.

The unique optimal permutation is (1, 2, ..., n+1). The same
set of jobs as for Theorem 4.5 is added to the above set of
n+1 jobs. □
4.3 Ordered Three Stage Flow Shops

The results of the previous section show that, with but
one exception, the minimal length scheduling problem for
j-maximal and j-minimal three-stage flow shops is
NP-complete. In this section, the flow shop is further
restricted so that it is j-maximal and at the same time
k-minimal for some $1 \le j, k \le 3$. Thus, for each job the
task on a certain processor is the largest and the task on
one of the remaining two processors is the smallest. This
leads to an ordering of the tasks $A_i$, $B_i$, $C_i$ for each job i.
Note that the resulting shop is still less restrictive than
the case considered by Smith, Panwalker and Dudek (1975).

Let  L  stand  for  the  processor  with  the  largest  task  of 
eacn  300,  S  the  processor  with  the  smallest  task,  and  M  (for 
medium)  for  the  remaining  processor.  Then  flow  shops  of  tne 
type  described  above  can  oe  classified  as  follows: 


                 processor
                  1  2  3

                  L  M  S   =  1-maximal & 3-minimal
                  L  S  M   =  1-maximal & 2-minimal
  type of         M  L  S   =  2-maximal & 3-minimal
  flow shop       M  S  L   =  3-maximal & 2-minimal
                  S  L  M   =  2-maximal & 1-minimal
                  S  M  L   =  3-maximal & 1-minimal
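The classification can be checked mechanically for a given set of jobs. The sketch below is illustrative only: the function name `shop_types` and the representation of a job as a tuple (A, B, C) are assumptions. Because ties between task lengths are allowed, a job set may be consistent with more than one label, so the function returns every label that fits.

```python
def shop_types(jobs):
    """Return every label (LMS, LSM, ...) consistent with all jobs.

    Position k of a label names the role of processor k+1:
    L = largest task, S = smallest task, M = the remaining one.
    """
    labels = []
    for label in ("LMS", "LSM", "MLS", "MSL", "SLM", "SML"):
        big = label.index("L")
        small = label.index("S")
        if all(t[big] == max(t) and t[small] == min(t) for t in jobs):
            labels.append(label)
    return labels


if __name__ == "__main__":
    print(shop_types([(5, 3, 1), (4, 4, 2)]))
    print(shop_types([(1, 2, 3)]))
```
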

It is clear that the O(n log n) algorithm for scheduling the 2-minimal flow shop works for LSM and MSL shops. Similarly, the proof of NP-completeness for the 2-maximal flow shop applies to MLS and SLM shops. However, the LMS and SML flow shops present a new problem. In the following sections, $O(n^6)$ algorithms are given for scheduling these types of flow shop. First, consider LMS flow shops.


4.3.1  LMS  Flow  Shops 

In computing the expression $L_p$, the length of permutation p (i.e. the length of the schedule derived from permutation p), the following convention is adopted. If $L_p(x,y) = L_p(r,s)$ and $s > y$ then use $L_p(r,s)$ when the actual indices that yield $L_p$ are of particular interest.

A set of n numbers $\{B_i \mid 1 \le i \le n\}$ is said to be almost sorted in descending order if for any $B_i$, $1 \le i < n$, there exists at most one value of j, $i < j \le n$, such that $B_i < B_j$.
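The definition translates directly into a test on a sequence. This is a minimal sketch; the function name `is_almost_sorted_desc` is an assumption, and the quadratic scan simply mirrors the wording of the definition rather than aiming at efficiency.

```python
def is_almost_sorted_desc(values):
    """True when each element is smaller than at most one later element."""
    n = len(values)
    for i in range(n - 1):
        larger_later = sum(1 for j in range(i + 1, n) if values[i] < values[j])
        if larger_later > 1:
            return False
    return True


if __name__ == "__main__":
    print(is_almost_sorted_desc([5, 3, 4, 2]))  # 3 is followed by one larger value
    print(is_almost_sorted_desc([3, 5, 4]))     # 3 is followed by two larger values
```
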


It will be shown that there exists an optimal permutation p with $L_p = L_p(u,v)$ which satisfies the following conditions:

(1) $B_{p(i)} \le B_{p(i+1)} \Rightarrow C_{p(i)} \ge C_{p(i+1)}$, $1 \le i < n$. In particular $B_{p(i)} < B_{p(i+1)} \Rightarrow C_{p(i)} > C_{p(i+1)}$, $1 \le i < n$.

(2) $C_{p(v)} > B_{p(k)}$, $v < k \le n$.

(3) $B_{p(i)} \ge C_{p(v)}$, $1 \le i \le v$.

(4) $\{B_{p(i)} \mid 1 \le i \le n\}$ is almost sorted in descending order.

(5) If m is the minimum index for which $B_{p(m)} < B_{p(u)}$ and $m < u$, then $B_{p(m)}, \ldots, B_{p(u-1)}$ is sorted in descending order. In addition, $B_{p(i)} \le B_{p(j)}$, $i > u$, $m \le j < u$.

The algorithm constructs an optimal permutation, p, which satisfies all of the above conditions.

Condition (1) indicates that in the scheduling process, jobs with the same execution time on the second stage may be arranged in descending order of the third stage.

Suppose the job which will be in the v-th position in p is known (i.e. in computing $L_p(u,v)$ it is known in advance which job will contribute the second index v). Then, the jobs to precede p(v) and those to follow job p(v) can be determined by conditions (2) and (3). Thus, the remaining n-1 jobs are partitioned into two subsets

(a) those jobs with $B_i \ge C_{p(v)}$, and

(b) those with $B_i < C_{p(v)}$.

The jobs in subset (a) must be executed before job p(v) while those in subset (b) must be executed after job p(v). The actual value of the index v will then be determined by the size of the subsets.

Now, suppose further it is known in advance which job will play the role of p(u). The jobs in subset (a) can then be partitioned further as follows:

(c) those jobs with $B_i \ge B_{p(u)}$, and

(d) those with $B_i < B_{p(u)}$.

Now the jobs in subset (c) must precede job p(u). (Otherwise, if there is i, $u < i \le v$, with $B_{p(i)} > B_{p(u)}$, then $L_p(i,v) > L_p(u,v)$ and $L_p \ne L_p(u,v)$.) By conditions (4) and (5), if k jobs in subset (d) precede job p(u) they must be the k largest in (d) and must immediately precede p(u) sorted in descending order, and k = u-m.

It is clear that the above conditions are not sufficient to determine an optimal permutation. There are still three unknown parameters, p(v), p(u) and the number, k, of the previous paragraph.

Therefore, the algorithm tries all possible combinations of the following choices:

(I) choose job p(v); use task $C_{p(v)}$ to partition the remaining jobs into subsets (a) and (b) as above.

(II) from subset (a) choose job p(u); use $B_{p(u)}$ to partition the remaining jobs in (a) into (c) and (d).

(III) choose $k \ge 0$ so that the largest k tasks of subset (d) will precede job p(u).

The decisions I, II, III will be said to be optimal whenever there exists an optimal almost sorted permutation for which the choices are correct. It is shown later how to obtain a permutation p, if one exists, for any set of decisions I, II, and III. The algorithm obtains all such permutations and selects the shortest one.

First, the above conditions are shown to hold for some optimal permutation. Conditions (1) and (2) are proved in the following lemmas.


Lemma 4.3: There exists an optimal permutation p for an LMS flow shop such that if $B_{p(i)} \le B_{p(i+1)}$ then $C_{p(i)} \ge C_{p(i+1)}$.

Proof: The proof is by construction. Suppose p is optimal, $B_{p(i)} \le B_{p(i+1)}$, and $C_{p(i)} < C_{p(i+1)}$. Consider permutation q obtained from p by interchanging p(i) and p(i+1) in p. We show that $L_q \le L_p$ by showing that for all x, y, $1 \le x \le y \le n$, $L_q(x,y) \le L_p(u,v)$ for some u, v, $1 \le u \le v \le n$. The following cases exhaust the possible values of x and y.

CASE 1: $x \le y < i$. $L_q(x,y) = L_p(x,y)$.

CASE 2: $x < i$, $y = i$. $L_q(x,i) = L_p(x,i+1) + C_{p(i)} - B_{p(i)}$. Since $C_{p(i)} \le B_{p(i)}$, $L_q(x,i) \le L_p(x,i+1)$.

CASE 3: $x < i$, $y = i+1$. $L_q(x,i+1) = L_p(x,i+1) + C_{p(i)} - C_{p(i+1)}$. Since $C_{p(i)} < C_{p(i+1)}$, $L_q(x,i+1) < L_p(x,i+1)$.

CASE 4: $x < i$, $y > i+1$. $L_q(x,y) = L_p(x,y)$.

CASE 5: $x = i$, $y = i$. $L_q(i,i) = L_p(i+1,i+1) + C_{p(i)} - A_{p(i)}$. Since $C_{p(i)} \le A_{p(i)}$, $L_q(i,i) \le L_p(i+1,i+1)$.

CASE 6: $x = i$, $y = i+1$. $L_q(i,i+1) = L_p(i+1,i+1) + B_{p(i)} + C_{p(i)} - A_{p(i)} - C_{p(i+1)}$. Since $C_{p(i)} < C_{p(i+1)}$ and $B_{p(i)} \le A_{p(i)}$, $L_q(i,i+1) < L_p(i+1,i+1)$.

CASE 7: $x = i$, $y > i+1$. $L_q(i,y) = L_p(i+1,y) + B_{p(i)} - A_{p(i)}$. Since $B_{p(i)} \le A_{p(i)}$, $L_q(i,y) \le L_p(i+1,y)$.

CASE 8: $x = i+1$, $y = i+1$. $L_q(i+1,i+1) = L_p(i+1,i+1) + B_{p(i)} + C_{p(i)} - B_{p(i+1)} - C_{p(i+1)}$. Since $C_{p(i)} < C_{p(i+1)}$ and $B_{p(i)} \le B_{p(i+1)}$, $L_q(i+1,i+1) < L_p(i+1,i+1)$.

CASE 9: $x = i+1$, $y > i+1$. $L_q(i+1,y) = L_p(i+1,y) + B_{p(i)} - B_{p(i+1)}$. Since $B_{p(i)} \le B_{p(i+1)}$, $L_q(i+1,y) \le L_p(i+1,y)$.

CASE 10: $i+1 < x \le y$. $L_q(x,y) = L_p(x,y)$.

Therefore, $L_q \le L_p$ and q must be optimal, for otherwise the optimality of p is contradicted. After a finite number of such exchanges, an optimal permutation which satisfies the lemma is obtained. □
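The interchange argument of Lemma 4.3 can be spot-checked by exhaustive enumeration on small instances. The sketch below is not part of the proof; the names `schedule_length`, `swap_adjacent` and `count_violations` are assumptions, positions are 0-based, and $L_p$ is evaluated as the maximum of $L_p(u,v) = \sum_{i \le u} A_{p(i)} + \sum_{u \le i \le v} B_{p(i)} + \sum_{i \ge v} C_{p(i)}$ over all $u \le v$.

```python
from itertools import product


def schedule_length(jobs, perm):
    """L_p: maximum of L_p(u, v) over all position pairs u <= v (0-based)."""
    a = [jobs[j][0] for j in perm]
    b = [jobs[j][1] for j in perm]
    c = [jobs[j][2] for j in perm]
    n = len(perm)
    return max(
        sum(a[: u + 1]) + sum(b[u : v + 1]) + sum(c[v:])
        for u in range(n)
        for v in range(u, n)
    )


def swap_adjacent(perm, i):
    """Return a copy of perm with positions i and i+1 interchanged."""
    q = list(perm)
    q[i], q[i + 1] = q[i + 1], q[i]
    return q


def count_violations(max_task=3, n=3):
    """Count swaps covered by Lemma 4.3 that would lengthen the schedule."""
    lms_jobs = [
        (a, b, c)
        for a in range(1, max_task + 1)
        for b in range(1, a + 1)
        for c in range(1, b + 1)
    ]  # every job satisfies A >= B >= C
    bad = 0
    for jobs in product(lms_jobs, repeat=n):
        p = list(range(n))
        base = schedule_length(jobs, p)
        for i in range(n - 1):
            if jobs[i][1] <= jobs[i + 1][1] and jobs[i][2] < jobs[i + 1][2]:
                if schedule_length(jobs, swap_adjacent(p, i)) > base:
                    bad += 1
    return bad
```

A return value of zero from `count_violations` is what the lemma predicts for LMS instances; the enumeration covers only the tiny task range passed in.
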


Corollary: There exists an optimal permutation p for an LMS flow shop such that if $B_{p(i)} < B_{p(i+1)}$ then $C_{p(i)} > C_{p(i+1)}$.

Proof: Same as the proof of the lemma using the inequalities $B_{p(i)} < B_{p(i+1)}$ and $C_{p(i)} \le C_{p(i+1)}$ instead of $B_{p(i)} \le B_{p(i+1)}$ and $C_{p(i)} < C_{p(i+1)}$. □


Lemma 4.4: Let p be any permutation for an LMS flow shop with length $L_p(u,v)$. Then, $C_{p(v)} > B_{p(k)}$, for $v < k \le n$.

Proof: Note that in an LMS flow shop $A_i \ge B_i \ge C_i$, $1 \le i \le n$. Since the length of p is $L_p(u,v)$, $L_p(u,k) < L_p(u,v)$ for $v < k \le n$. This implies

$\sum_{i=1}^{u} A_{p(i)} + \sum_{i=u}^{k} B_{p(i)} + \sum_{i=k}^{n} C_{p(i)} < \sum_{i=1}^{u} A_{p(i)} + \sum_{i=u}^{v} B_{p(i)} + \sum_{i=v}^{n} C_{p(i)}$

$\Rightarrow \sum_{i=v+1}^{k} B_{p(i)} < \sum_{i=v}^{k-1} C_{p(i)}$

$\Rightarrow \sum_{i=v+1}^{k-1} B_{p(i)} + B_{p(k)} < \sum_{i=v+1}^{k-1} C_{p(i)} + C_{p(v)}$.

Since $B_{p(i)} \ge C_{p(i)}$ for any i, it follows that $B_{p(k)} < C_{p(v)}$. □

The following lemma is needed to facilitate later proofs.


Lemma 4.5: There exists an optimal permutation p for an LMS flow shop such that if $B_{p(i)} < B_{p(i+1)}$ then $C_{p(i)} > B_{p(j)}$, $i+1 < j \le n$.

Proof: The proof is again by construction. Let p be optimal with $B_{p(i)} < B_{p(i+1)}$ and such that the permutation q obtained by interchanging p(i) and p(i+1) in p is not optimal. We show that $L_q = L_q(r,i+1)$ for some r, $1 \le r \le i+1$, by proving that $L_q(x,y) \le L_p$ for $y \ne i+1$. By the corollary to Lemma 4.3, $C_{p(i)} > C_{p(i+1)}$. The following cases (similar to Lemma 4.3) are obtained.


CASE 1: $x \le y < i$. $L_q(x,y) = L_p(x,y)$.

CASE 2: $x < i$, $y = i$. $L_q(x,i) = L_p(x,i+1) + C_{p(i)} - B_{p(i)}$. Since $C_{p(i)} \le B_{p(i)}$, $L_q(x,i) \le L_p(x,i+1)$.

CASE 3: $x < i$, $y = i+1$. ** unknown **

CASE 4: $x < i$, $y > i+1$. $L_q(x,y) = L_p(x,y)$.

CASE 5: $x = i$, $y = i$. $L_q(i,i) = L_p(i+1,i+1) + C_{p(i)} - A_{p(i)}$. Since $C_{p(i)} \le A_{p(i)}$, $L_q(i,i) \le L_p(i+1,i+1)$.

CASE 6: $x = i$, $y = i+1$. ** unknown **

CASE 7: $x = i$, $y > i+1$. $L_q(i,y) = L_p(i+1,y) + B_{p(i)} - A_{p(i)}$. Since $B_{p(i)} \le A_{p(i)}$, $L_q(i,y) \le L_p(i+1,y)$.

CASE 8: $x = i+1$, $y = i+1$. ** unknown **

CASE 9: $x = i+1$, $y > i+1$. $L_q(i+1,y) = L_p(i+1,y) + B_{p(i)} - B_{p(i+1)}$. Since $B_{p(i)} < B_{p(i+1)}$, $L_q(i+1,y) < L_p(i+1,y)$.

CASE 10: $i+1 < x \le y$. $L_q(x,y) = L_p(x,y)$.

Since q is not optimal, $L_q > L_p$. But from the above cases it is clear that $L_q(x,y) \le L_p$ whenever $y \ne i+1$. Hence, there exists r, $1 \le r \le i+1$, such that $L_q(r,i+1) > L_p$. It follows that $L_q(r,i+1) > L_q(r,j)$, $i+1 < j \le n$. By Lemma 4.4, $C_{q(i+1)} > B_{q(j)}$, or $C_{p(i)} > B_{p(j)}$, $i+1 < j \le n$. □


It is convenient to first prove condition (4) in the next lemma and use the result in proving condition (3).

Lemma 4.6: There exists an optimal permutation p for an LMS flow shop such that $\{B_{p(i)} \mid 1 \le i \le n\}$ is almost sorted in descending order.

Proof: The proof is by induction on the size of the set $\{B_{p(i)}, \ldots, B_{p(n)}\}$.

Initial case: $\{B_{p(n)}\}$ is almost sorted in descending order.

Induction step: Suppose $\{B_{p(i+1)}, \ldots, B_{p(n)}\}$ is almost sorted in descending order. If $B_{p(i)} \ge B_{p(i+1)}$, then the almost sorted condition holds for positions i, ..., n. If, on the other hand, $B_{p(i)} < B_{p(i+1)}$, then, by Lemma 4.5, $C_{p(i)} > B_{p(j)}$, $j > i+1$. Now, since $B_{p(i)} \ge C_{p(i)}$ it follows that $B_{p(i)} > B_{p(j)}$, $j > i+1$.

Hence, in any case, $\{B_{p(i)}, \ldots, B_{p(n)}\}$ is almost sorted in descending order. □


Lemma 4.7 proves a stronger result than condition (3), which is stated in the corollary.

Lemma 4.7: There exists an optimal permutation p for an LMS flow shop such that $B_{p(i)} \ge C_{p(j)}$ for $1 \le i < j \le n$.

Proof: Let p be an optimal permutation satisfying Lemmas 4.3 to 4.6. We first show that $B_{p(j-1)} \ge C_{p(j)}$, $1 < j \le n$. If $B_{p(j-1)} \ge B_{p(j)}$, then $B_{p(j-1)} \ge C_{p(j)}$, since $B_{p(j)} \ge C_{p(j)}$. If $B_{p(j-1)} < B_{p(j)}$, then by the corollary to Lemma 4.3 $C_{p(j-1)} > C_{p(j)}$, and since $B_{p(j-1)} \ge C_{p(j-1)}$, we obtain $B_{p(j-1)} > C_{p(j)}$. Thus, in either case, $B_{p(j-1)} \ge C_{p(j)}$.

Now consider the case of arbitrary indices i and j, $1 \le i < (j-1) < j \le n$. This case is trivial if $B_{p(i)} \ge B_{p(j)}$. As for $B_{p(i)} < B_{p(j)}$, since $\{B_{p(i)} \mid 1 \le i \le n\}$ is almost sorted in descending order, $B_{p(i)} \ge B_{p(k)}$, $i < k \le n$, $k \ne j$. In particular, $B_{p(i)} \ge B_{p(j-1)}$, and $B_{p(j-1)} \ge C_{p(j)}$ (by the proof of the first part). Hence, $B_{p(i)} \ge C_{p(j)}$. □

Corollary: If $L_p = L_p(u,v)$ then $B_{p(i)} \ge C_{p(v)}$, $1 \le i \le v$.

Proof: Follows directly from the lemma. □

The final condition, (5), is given in the next lemma.

Lemma 4.8: Let p be an optimal permutation for an LMS flow shop which satisfies Lemmas 4.3 to 4.7. Let the length of p be $L_p(u,v)$. If m is the minimum index for which $B_{p(m)} < B_{p(u)}$ and $m < u$, then the subsequence $B_{p(m)}, \ldots, B_{p(u-1)}$ is sorted in descending order. In addition, $B_{p(i)} \le B_{p(j)}$, $i > u$, $m \le j < u$.

Proof: This follows directly from the almost sorted condition. □

This concludes the proofs of the five conditions. We now show how to obtain a permutation p which corresponds to a set of decisions I, II and III, if such a permutation exists.

Recall that after making the decisions I, II and III, the following situation results.

(1) The first (m - 1) jobs are those with $B_i \ge B_{p(u)}$.

(2) The next k = u - m jobs consist of the k jobs which have largest second stage among those with second stage less than $B_{p(u)}$.

(3) Next job is p(u).

(4) Job p(u) is followed by the remaining jobs with $B_{p(u)} > B_i \ge C_{p(v)}$.

(5) Next comes job p(v).

(6) The remaining jobs follow job p(v).

The second group may be empty depending on decision III. The actual values of the indices m, u and v are determined by the sizes of the above sets. The first step is to sort each group in descending order of $B_i$, using descending order of $C_i$ (Lemma 4.3) whenever there is a tie between the $B_i$. This results in an initial almost sorted permutation p which is further modified by the following procedure ALSORT where necessary.
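The six groups above can be assembled mechanically once the three decisions are fixed. The following sketch is illustrative only: the function name `initial_permutation`, the 0-based job indices, and the choice to raise `ValueError` for an infeasible decision set are all assumptions, and the case where the same job plays both roles p(u) and p(v) is not handled. It produces the starting permutation only; the later cycle operations of ALSORT are not included.

```python
def initial_permutation(jobs, v_job, u_job, k):
    """Order the jobs as groups (1)-(6) for decisions I (v_job), II (u_job), III (k).

    jobs is a list of (A, B, C) tuples; v_job and u_job are distinct 0-based indices.
    """
    def key(j):
        # Descending B, ties broken by descending C (Lemma 4.3).
        return (-jobs[j][1], -jobs[j][2])

    c_v = jobs[v_job][2]
    b_u = jobs[u_job][1]
    if b_u < c_v:
        raise ValueError("p(u) must belong to subset (a)")

    others = [j for j in range(len(jobs)) if j not in (u_job, v_job)]
    subset_b = sorted((j for j in others if jobs[j][1] < c_v), key=key)
    subset_a = [j for j in others if jobs[j][1] >= c_v]
    subset_c = sorted((j for j in subset_a if jobs[j][1] >= b_u), key=key)
    subset_d = sorted((j for j in subset_a if jobs[j][1] < b_u), key=key)
    if k > len(subset_d):
        raise ValueError("k exceeds the size of subset (d)")

    return subset_c + subset_d[:k] + [u_job] + subset_d[k:] + [v_job] + subset_b


if __name__ == "__main__":
    jobs = [(9, 8, 2), (9, 7, 3), (9, 4, 4), (9, 5, 1), (9, 3, 2), (9, 9, 5)]
    print(initial_permutation(jobs, v_job=2, u_job=1, k=1))
```

With this layout the indices of the text follow from the group sizes: m - 1 is the length of `subset_c`, and u = m + k.
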


In order to clearly explain the operation of procedure ALSORT, two operations, cycle and sort, are defined on a pair of indices i, j as follows. Applying the operation cycle(i,j) to a sequence $\{B_1, \ldots, B_i, \ldots, B_j, \ldots, B_n\}$ produces the sequence $\{B_1, \ldots, B_{i+1}, \ldots, B_j, B_i, \ldots, B_n\}$. Applying the operation sort(i,j) to the above sequence reverses the effect of the previous cycle operation. In other words, cycle rotates the elements in positions i through j one place to the left and "sort" rotates them one place in the opposite direction. Intuitively, cycle operations will be performed on the permutation initially obtained by sorting the jobs as indicated above so as to reduce the length of the schedule.
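The two operations are simple rotations of a slice. A minimal sketch follows, assuming a Python list and the 1-based, inclusive positions used in the text; the name `sort_back` is chosen here to avoid confusion with Python's built-in sorting and is not the text's name.

```python
def cycle(seq, i, j):
    """Rotate positions i..j (1-based, inclusive) one place to the left, in place."""
    seq[i - 1 : j] = seq[i:j] + [seq[i - 1]]


def sort_back(seq, i, j):
    """Undo cycle(seq, i, j): rotate positions i..j one place to the right."""
    seq[i - 1 : j] = [seq[j - 1]] + seq[i - 1 : j - 1]


if __name__ == "__main__":
    s = [1, 2, 3, 4, 5]
    cycle(s, 2, 4)
    print(s)
    sort_back(s, 2, 4)
    print(s)
```
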

The almost sorted condition implies that there exists a disjoint set of index pairs $\{(s(i),t(i)) \mid 1 \le i \le l\}$, where $s(i-1) < t(i-1) < s(i) < t(i)$, such that p may be obtained from the permutation supplied to ALSORT by applying cycle(s(i),t(i)), $1 \le i \le l$. The number of pairs, l, is initially unknown but bounded by n/2. It will be shown that for an optimal set of decisions I, II, III procedure ALSORT will find an optimal index set and perform the required cycle operations.

For any permutation p let E(i,j) be the earliest time that task i of job p(j) can be finished and Y(j) the sum of execution times of all third stage tasks following $C_{p(j)}$, in the permutation schedule corresponding to p. A procedure to calculate and update the values of E(i,j) and Y(j) is needed. Set E(i,0) = E(0,j) = 0, $0 \le i \le 3$, $0 \le j \le n$, and set Y(n) = 0. Then, the values of E(i,j) and Y(j) can be computed with the relations:

$Y(j) = Y(j+1) + C_{p(j+1)}$, $0 \le j < n$, and

$E(i,j) = \max\{E(i-1,j), E(i,j-1)\} + t$, $1 \le i \le 3$, $1 \le j \le n$,

where t is $A_{p(j)}$, $B_{p(j)}$ or $C_{p(j)}$ depending on i. This computation is provided by procedure LENGTH(p,x,y) (Procedure 5) which calculates E(i,j), $1 \le i \le 3$, $x \le j \le y$, and Y(j), $x \le j \le y$, with the assumption that the boundary (i = 0, j < x, j > y) contains appropriate values. Clearly, this procedure requires only O(y-x+1) time. Note that $E(3,n) = L_p$ and $E(3,j) + Y(j) = \max\{L_p(r,j) \mid 1 \le r \le j\}$.
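The two recurrences can be written out and compared with a direct evaluation of $L_p(u,v)$. The sketch below is illustrative; the names `earliest_finish`, `tail_sums` and `path_length` are assumptions, and the tables are recomputed in full rather than updated over a range as LENGTH does. Because E(3,j) already takes the maximum with E(3,j-1), the quantity compared here is the cumulative maximum of $L_p(r,s)$ over $1 \le r \le s \le j$; this equals the maximum over r of $L_p(r,j)$ whenever no earlier position gives a larger value.

```python
def earliest_finish(jobs, perm):
    """E[i][j] for 0 <= i <= 3, 0 <= j <= n, with a zero boundary row and column."""
    n = len(perm)
    E = [[0] * (n + 1) for _ in range(4)]
    for j in range(1, n + 1):
        for i in range(1, 4):
            t = jobs[perm[j - 1]][i - 1]  # A, B or C of job p(j), depending on i
            E[i][j] = max(E[i - 1][j], E[i][j - 1]) + t
    return E


def tail_sums(jobs, perm):
    """Y[j] = sum of third-stage tasks in positions j+1..n, with Y[n] = 0."""
    n = len(perm)
    Y = [0] * (n + 1)
    for j in range(n - 1, -1, -1):
        Y[j] = Y[j + 1] + jobs[perm[j]][2]  # perm[j] is p(j+1) in 1-based terms
    return Y


def path_length(jobs, perm, u, v):
    """L_p(u, v) for 1-based positions u <= v."""
    return (
        sum(jobs[j][0] for j in perm[:u])
        + sum(jobs[j][1] for j in perm[u - 1 : v])
        + sum(jobs[j][2] for j in perm[v - 1 :])
    )


if __name__ == "__main__":
    jobs = [(5, 3, 1), (4, 4, 2), (6, 2, 2)]
    perm = [0, 1, 2]
    E, Y = earliest_finish(jobs, perm), tail_sums(jobs, perm)
    for j in range(1, len(perm) + 1):
        print(j, E[3][j] + Y[j])
```
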

Once the decision set I, II and III has been specified, $L_p(u,v)$ can be calculated. Thus, ALSORT starts by computing $L_p(u,v)$. Essentially, the algorithm seeks an almost sorted permutation, p, with $L_p = L_p(u,v)$. This is done by ensuring that $\max\{L_p(r,j) \mid 1 \le r \le j\} \le L_p(u,v)$, $1 \le j \le n$. Thus, for each position j starting from 1 up to n it checks if

$\max\{L_p(r,j) \mid 1 \le r \le j\} > L_p(u,v)$,    (5)

or $E(3,j) + Y(j) > L_p(u,v)$.

There are two cases to be considered depending on the position j, $1 \le j \le n$.

Case 1: $m \le j \le u$ (or j = u, when u < m) or j = v. For this case, if the Inequality (5) holds then the decision set I, II, III cannot be optimal, i.e. there does not exist an almost sorted sequence p with $L_p = L_p(u,v)$. The procedure CHECK(p,x,y) (Procedure 6) is used to check that the Inequality (5) does not hold between any two given positions x and y. The logical value returned by the procedure indicates whether or not the condition has been detected in the given range. This procedure also takes O(y-x+1) time.

    procedure LENGTH(p, x, y);
    begin comment A_i, B_i, C_i, 1 <= i <= n, and E(i,j), 0 <= i <= 3, 0 <= j <= n,
            and Y(j), 0 <= j <= n, are global variables.
            A_i, B_i, C_i are task lengths for the LMS flow shop;
            E(i,j) is earliest time task i of job p(j) can finish.
            E(i,0) and E(0,j), 0 <= i <= 3, 0 <= j <= n, are initialized to 0.
            Y(j) is the sum of task lengths C_p(k), j < k <= n.
            Y(n) is initialized to 0.
            p is the permutation under consideration.
            E(i,j) and Y(j) are updated for x <= j <= y;
        for j := x until y do
        begin for i := 1 until 3 do
            begin t := (A_p(j), B_p(j), C_p(j)) depending on i;
                  E(i,j) := maximum{E(i-1,j), E(i,j-1)} + t;
            end;
        end;
        for j := y-1 step -1 until x-1 do
            Y(j) := Y(j+1) + C_p(j+1);
    end;

    PROCEDURE 5


Case 2: $1 \le j < m$ (or $1 \le j < u$ if u < m), $u < j < v$ or $v < j \le n$. For these positions, if the Inequality (5) holds at j, it is still possible that there exists an almost sorted permutation with $L_p = L_p(u,v)$. To discover such a permutation, an attempt is made to perform a cycle(i,j) operation, where $1 \le i < j < m$ (or $1 \le i < j < u$) or $u < i < j < v$ or $v < i < j \le n$, depending on which of the above ranges j falls in, so that the inequality no longer holds at any position from 1 to j inclusive. This may require that a previous cycle operation be reversed (a sort(i,j) operation) in order to accommodate a cycle operation at a higher index, since the permutation must at all times remain almost sorted. These checks and necessary cycle operations are carried out with the procedure CHECKCYCLE(p,x,y) (Procedure 7). Note that after a cycle(i,j) operation, this routine uses the CHECK procedure to test if (5) no longer holds. It is easily seen that the procedure takes no more than $O(n^2)$ time.

    procedure CHECK(p, x, y);
    begin comment procedure checks to ensure
            L_p(r,j) <= L_p(u,v), x <= j <= y.
            Logical value FLAG indicates success or failure.
            LEN is L_p(u,v).
            Other variables are as in procedure 5;
        FLAG := true;
        LENGTH(p, x, y);
        j := x;
        while FLAG and j <= y do
        begin if E(3,j) + Y(j) > LEN then FLAG := false;
              j := j + 1;
        end;
        return FLAG;
    end;

    PROCEDURE 6


Thus, procedure ALSORT (Procedure 8) computes $L_p(u,v)$ and, with the aid of the procedures discussed above, attempts to find a permutation p with $L_p = L_p(u,v)$ by

(1) checking that Inequality (5) is not true for $m \le j \le u$ (or j = u for u < m) and j = v, and

(2) ensuring (performing appropriate cycle operations when necessary) that Inequality (5) does not hold for $1 \le j < m$ (or $1 \le j < u$ for u < m), $u < j < v$, and $v < j \le n$.


    procedure CHECKCYCLE(p, x, y);
    begin comment procedure performs the same checks as
            procedure CHECK but tries to perform CYCLE
            operations at problem spots.
            A stack of cycled index pairs is maintained in
            arrays S-ST(*) and T-ST(*) with the top pair being
            S-ST(TOP) and T-ST(TOP);
        FLAG := true;
        comment initialize stacks;
        TOP := 0; S-ST(TOP) := 0; T-ST(TOP) := 0;
        comment check next position and if necessary cycle;
        for j := x until y do
        begin LENGTH(p, j, j);
            if E(3,j) + Y(j) > LEN then
            begin comment there exists r such that
                    L_p(r,j) > L_p(u,v);
                FLAG := false; i := j - 1;
                while not(FLAG) and i >= x do
                begin if i = T-ST(TOP) then
                    begin SORT(S-ST(TOP), T-ST(TOP));
                          LENGTH(p, S-ST(TOP), T-ST(TOP));
                          comment pop stack; TOP := TOP - 1;
                    end;
                    CYCLE(i, j); LENGTH(p, i, j);
                    if CHECK(p, 1, j) then
                    begin comment push stack;
                          TOP := TOP + 1;
                          S-ST(TOP) := i; T-ST(TOP) := j;
                          FLAG := true;
                    end else
                    begin SORT(i, j); LENGTH(p, i, j);
                          i := i - 1;
                    end;
                end;
            end;
        end;
        return FLAG;
    end;

    PROCEDURE 7

    procedure ALSORT(p, m, u, v);
    begin comment global variables are explained in
            Procedure 5.
            p, m, u, and v have the same significance as in
            the text. LEN is the length of the modified
            permutation at exit;
        LEN := sum(i=1..u) A_p(i) + sum(i=u..v) B_p(i) + sum(i=v..n) C_p(i);
        comment check and if necessary cycle in positions 1
            to m-1;
        HIGH := u-1; if m < u then HIGH := m-1;
        if HIGH > 0 then
        begin LOW := 1;
            if not(CHECKCYCLE(p, LOW, HIGH)) then goto NOGOOD;
        end;
        comment check positions m to u (or just u);
        LOW := m; if m > u then LOW := u;
        if not(CHECK(p, LOW, u)) then goto NOGOOD;
        comment check (and cycle) in positions u+1 to v-1;
        if v-u > 1 then
            if not(CHECKCYCLE(p, u+1, v-1)) then goto NOGOOD;
        comment check position v;
        if not(CHECK(p, v, v)) then goto NOGOOD;
        comment check (and cycle) in positions v+1 to n;
        if v < n then
            if not(CHECKCYCLE(p, v+1, n)) then goto NOGOOD;
        comment permutation found;
        LEN := E(3,n); goto DONE;
        comment failed to obtain permutation, abort;
    NOGOOD: LENGTH(p, 1, n); LEN := E(3,n);
    DONE: end;

    PROCEDURE 8

In order to prove that procedure ALSORT works as required, the following lemma, which is in a sense a stronger version of Lemma 4.5, is used.

Lemma 4.9: Let p be an almost sorted optimal permutation for an LMS flow shop with the positions s,...,t cycled. Let q be a permutation such that q(i) = p(i), $1 \le i < s$, jobs p(s), ..., p(t) are rearranged as p(s), ..., p(k-1), p(t), p(k), ..., p(t-1) in positions s,...,t of q and the remaining jobs are in any order in q. Then, the following hold.

(1) $L_q(x,i) \le L_p$, $1 \le i \le k$, $1 \le x \le i$.

(2) There exists i, $k < i \le t$, such that $L_q(x,i) > L_p$ for some x, $1 \le x \le i$.

Proof: First assume that jobs p(t+1), ..., p(n) are in the same order in q as in p. Note that by previous lemmas (the corollary to Lemma 4.3 in particular), p does not contain any unnecessary cycled positions. Hence, q must be sub-optimal since in q (s,k) rather than (s,t) is cycled. Now, the only positions in q which differ from those of p are in positions k to t, thus:

p: $B_{p(k)}, \ldots, B_{p(t-1)}, B_{p(t)}$

q: $B_{p(t)}, B_{p(k)}, \ldots, B_{p(t-1)}$

If there is any position i < k such that $L_q(x,i) > L_p$, then the change from q to p does not affect the value of $L_q(x,i)$ and thus $L_q(x,i) = L_p(x,i) > L_p$, a contradiction. Similarly, suppose $L_q(x,k) > L_p$ for some x. Since $B_{p(t)} > B_{p(k)}$, $B_{p(k)} \ge C_{p(k)}$, and $B_{p(i)} \ge C_{p(i)}$, $k < i < t$, it follows that $L_q(x,k) \le L_p(x,t)$. Therefore, $L_p < L_q(x,k) \le L_p(x,t)$, again a contradiction. Hence, part (1) follows.

Now since q is not optimal there exists some i > k such that $L_q(x,i) > L_p$ for some x, $1 \le x \le i$. But suppose that there is no such $i \le t$. Since the only differences between q and p occur in positions (k,...,t), x must satisfy $k \le x \le t$, for otherwise $L_q(x,i) = L_p(x,i)$. Now, for $k \le x \le t$ we have $L_q(x,i) \le L_p(t,i)$ since $B_{p(j)} \le A_{p(j)}$, $x \le j < t$, and if x > k, $B_{q(x)} = B_{p(x-1)} < B_{p(t)}$, and $B_{p(t)} \le A_{p(t)}$. But $L_q(x,i) \le L_p(t,i)$ implies that $L_p(t,i) > L_p$, a contradiction. Hence, there exists i, $k < i \le t$, such that $L_q(x,i) > L_p$, for some x, $1 \le x \le i$. Thus part (2) is proved.

It is clear that if the jobs in positions t+1,...,n in q are not in the same sequence as in p, the above conclusions still hold since the order of these jobs does not affect the values $L_q(x,i)$ for $i \le t$. □


Lemma 4.10: Procedure ALSORT will perform an optimal set of cycle operations when the set of decisions, I, II, III, is optimal. The procedure requires at most $O(n^3)$ time.


Proof: Suppose an optimal set of decisions I, II, III has been made. Then there exists an optimal almost sorted permutation, p, with $L_p = L_p(u,v)$ where p(u) and p(v) are the jobs chosen in II and I and the jobs p(i), $m \le i < u$, are the jobs chosen in III, whose second-stage tasks are less than $B_{p(u)}$. Now, p differs from the permutation passed to ALSORT only in that the cycle operation has been performed once on a disjoint set of index pairs $\{(s(i),t(i)) \mid 1 \le i \le l\}$ as previously noted. It is sufficient to show that ALSORT will find all the required positions s(i), t(i) and perform the correct cycle operations, thereby deriving p. The case of only one cycle is straightforward. Suppose the first k-1 cycles have been found, yielding an intermediate permutation q; then by Lemma 4.9, there will be a position, i, in the range of the k-th cycle for which $L_q(x,i) > L_p = LEN$, for some x, $1 \le x \le i$. Since this is precisely the condition that ALSORT tests, the procedure will seek the index s(k) and subsequently cycle(s(k),i) in order to remove the condition.

Notice that even if the true cycle is on (s(k),t(k)) while the algorithm initially tries cycle(s(k),i), s(k) < i < t(k), part (1) of Lemma 4.9 ensures that a temporary solution consisting of cycle(s(k),i) exists and part (2) ensures that the procedure will subsequently seek to cycle(s(k),t(k)), since any partial cycling does not entirely remove the condition from the positions s(k),...,t(k).

The complexity of ALSORT is dominated by the calls to the CHECK and CHECKCYCLE procedures. Since only one of these is called for each j, $1 \le j \le n$, and since CHECK is linear and CHECKCYCLE requires at most $O(n^2)$ time, it follows that procedure ALSORT takes at most $O(n^3)$ time. □


The main algorithm is presented in Procedure 9.

    Algorithm LMS-FLOW;
    begin comment p is the permutation under test, OPTPERM
            is the optimal permutation after running the
            algorithm. LEN and OPTLEN are the lengths of p and
            OPTPERM respectively. PSAVE is used to save the
            permutation p before decision III is made so that
            it can be restored for alternative choices. Other
            variables are explained in Procedure 5. It is
            assumed that the initial pre-sorting in descending
            order of C_i within B_i has been done;
        for i := 0 until 3 do E(i,0) := 0; Y(n) := 0;
        for j := 0 until n do E(0,j) := 0;
        comment initialize permutation to sorted order;
        for i := 1 until n do p(i) := i;
        comment obtain an upper bound for schedule length;
        LENGTH(p, 1, n); OPTLEN := E(3,n); OPTPERM := p;
        comment try all combinations of decisions I, II, III;
        for i := 1 until n do
        begin comment index i makes decision I;
            find v >= i such that B_p(x) >= C_p(i) > B_p(y), ...
            CYCLE(i, v); LENGTH(p, i, v);
            for k := 1 until n do psave(k) := p(k);
            for j := 1 until v do
            begin comment index j makes decision II;
                for u := j until v do
                begin comment index u makes decision III;
                    if j < v then
                    begin if u < v then
                        begin CYCLE(j, u); LENGTH(p, j, u);
                              m := j; if m = u then m := m+1;
                        end else goto NEXT;
                    end else m := v+1;
                    ALSORT(p, m, u, v);
                    if LEN < OPTLEN then
                    begin OPTLEN := LEN;
                          for k := 1 until n do OPTPERM(k) := p(k);
                    end;
                    comment restore p for next choice III;
                    for k := 1 until n do p(k) := psave(k);
                    LENGTH(p, 1, n);
                end;
    NEXT:   end;
            comment restore sorted input permutation;
            for k := 1 until n do p(k) := k;
        end;
    end;

    PROCEDURE 9

. 
Theorem 4.7: Algorithm LMS-FLOW finds an optimal permutation for an LMS flow shop in time at most $O(n^6)$.

Proof: The algorithm tries all combinations of choices I, II and III, and must at some stage have an optimal decision set. But by Lemma 4.10 procedure ALSORT produces an optimal permutation whenever the decision set is optimal; hence, this algorithm will find at least one optimal permutation.

The three choices produce at most $n^3$ initial permutations to be modified by procedure ALSORT, which takes only $O(n^3)$ time. Hence, an optimal permutation will be found in $O(n^6)$ time. □

Note that the algorithm trivially requires only $O(n)$ space.

4.3.2  SML  Flow  Shops 

Algorithm LMS-FLOW can be used to schedule an SML flow shop as follows. Given an SML flow shop with jobs $(A_i, B_i, C_i)$, $1 \le i \le n$, with $A_i \le B_i \le C_i$, apply algorithm LMS-FLOW to the flow shop with $A'_i = C_{n+1-i}$, $B'_i = B_{n+1-i}$, $C'_i = A_{n+1-i}$, $1 \le i \le n$, to get an optimal permutation $p = (p(1), \ldots, p(n))$.

An optimal permutation for the SML flow shop is $p' = (p(n), p(n-1), \ldots, p(1))$.
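
The reversal idea behind this conversion can be checked on a small instance. The following is a minimal Python sketch and not the thesis's procedure: `makespan` and `mirror` are hypothetical helpers, the job data are made up, brute-force enumeration stands in for LMS-FLOW, and for simplicity only the processors are mirrored (the re-indexing of jobs is omitted). Under these assumptions, a permutation on the mirrored shop has the same length as the reversed permutation on the original shop.

```python
from itertools import permutations

def makespan(jobs, perm):
    """Length of the permutation schedule `perm` for a 3-processor flow shop.

    jobs[k] = (A, B, C) gives the task lengths of job k on processors 1, 2, 3.
    """
    finish = [0, 0, 0]  # finishing time of the latest task on each processor
    for k in perm:
        ready = 0  # time at which job k leaves the previous processor
        for proc in range(3):
            start = max(finish[proc], ready)
            finish[proc] = start + jobs[k][proc]
            ready = finish[proc]
    return finish[2]

def mirror(jobs):
    """Swap the roles of processors 1 and 3 for every job."""
    return [(c, b, a) for (a, b, c) in jobs]

# Made-up SML-style data: each job has A <= B <= C.
sml_jobs = [(1, 4, 6), (2, 3, 7), (1, 2, 5), (3, 3, 4)]
lms_jobs = mirror(sml_jobs)

# Brute force stands in for LMS-FLOW here; it is only viable for tiny n.
best = min(permutations(range(len(lms_jobs))),
           key=lambda p: makespan(lms_jobs, p))
best_for_sml = tuple(reversed(best))
```

Reversing `best` gives a permutation for the original shop with the same schedule length, which is the property the conversion relies on.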

Theorem 4.8: The above application of algorithm LMS-FLOW produces an optimal schedule for an SML flow shop in time $O(n^6)$.

Proof: An SML problem can be regarded as one of minimizing the length of a schedule built from the end backwards, with each job executing first on processor 3, then on processor 2 and finally on processor 1. From this point of view, an SML flow shop problem becomes an LMS flow shop problem, and this is precisely the approach taken.

The preconversion before application of algorithm LMS-FLOW as well as the final conversion of $p$ to $p'$ are linear. Hence, an optimal permutation is found in at most $O(n^6)$ time. □

4.4  Discussion 

The foregoing results on some special structure three-stage flow shops are summarized in Figure 21. In the Venn diagram, each circle represents a $j$-minimal or $j$-maximal flow shop ($1 \le j \le 3$). Their intersections, leading to the LMS and the other types of flow shop considered, and the associated complexities are indicated. Although the algorithm for the LMS (and SML) flow shops is polynomial in $n$, its running time of $O(n^6)$ makes it highly impractical. Improving the given algorithm or devising a faster one, possibly with a completely different approach, remains an open problem.
FIGURE 21: Relationships and complexities of the special 3-stage flow shops


Chapter  Five 


OPEN  SHOP  SCHEDULES 

Another processor bound system of interest is the open shop, which is closely related to the flow shop. The only difference between them is that in an open shop, no restrictions are placed on the order in which the tasks of any job are to be done. Thus, an open shop consists of $m \ge 1$ processors, $P_i$, $1 \le i \le m$, and $n \ge 1$ jobs, where each job has $m$ component tasks and the $j$-th task of the $i$-th job is to be performed on the $j$-th processor for any $i$. This chapter deals mainly with the problem of finding optimal non-preemptive schedules for the open shop with the mean flow time performance measure.

5.1  Survey 

The problem of finding minimal length schedules for the open shop has been studied by Gonzalez and Sahni (1976), who provide some motivation for interest in this particular model, and by Gonzalez (1976). Gonzalez and Sahni presented a linear algorithm for the 2-processor preemptive and non-preemptive systems, and for $m \ge 3$ gave an efficient algorithm for preemptive schedules, while the non-preemptive problem was shown to be NP-complete. Gonzalez subsequently presented a faster algorithm for the preemptive case when $m \ge 3$.

As for the mean flow time minimization problem, Gonzalez (1979) showed NP-completeness for the case of an arbitrary number of processors. For the case of only one processor, a well-known solution, reported by Smith (1956), is to schedule the tasks in order of non-decreasing execution time.
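
The one-processor rule can be stated in a few lines. The sketch below is illustrative only: the function names and task lengths are assumptions, and total flow time is used in place of the mean (the two differ only by the factor 1/n).

```python
from itertools import permutations

def total_flow_time(lengths):
    """Sum of finishing times when tasks run back to back in the given order."""
    clock = 0
    total = 0
    for length in lengths:
        clock += length
        total += clock
    return total

def spt_order(lengths):
    """Shortest-processing-time order: non-decreasing execution time."""
    return sorted(lengths)

tasks = [7, 2, 5, 3]  # made-up execution times
spt = spt_order(tasks)
spt_flow = total_flow_time(spt)
# Exhaustive comparison, feasible only because the example is tiny.
best_flow = min(total_flow_time(p) for p in permutations(tasks))
```

For these four tasks the SPT order and the exhaustive search agree on the minimum total flow time.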

In  the  following,  the  mean  flow  time  problem  is  shown 
to  be  NP-complete  for  two  processors.  Subsequently,  in 
Section  5.3,  tight  bounds  on  the  mean  flow  time  of  an 
arbitrary  schedule  and  an  SPT  schedule  are  obtained  in  terms 
of  the  optimal  mean  flow  time. 

5.2  Complexity  of  Mean  Flow  Schedules 

The 2-processor mean flow problem is shown NP-complete by a reduction from 3-PARTITION (Section 1.2.2). The open shop to be constructed consists of a large number of jobs, but their number and the sum of their lengths will be polynomial in $nK$. There are four types of jobs, the T, U, X and Y-jobs. U-jobs are further divided into V and W-jobs.

The X and Y-type jobs are large jobs which must be the last jobs in the schedule in order to keep the mean flow small. Furthermore, their execution must not be delayed beyond certain specific times if the bound $D$ is not to be exceeded. The T-jobs, which have very large tasks on processor 1 compared to the T and U-tasks on processor 2, are used to generate fixed-size gaps on the second processor which can be filled by the U-jobs if and only if the given instance of 3-PARTITION has a solution.

For convenience in specifying the different types of jobs, a job will be given as an ordered pair $(x, y)$ where $x$ is the time taken by the job on processor 1 and $y$ is the time taken on processor 2. The notation $S(X)$, $F(X)$ introduced in Chapter 1 for the starting and finishing time of a task (or job), $X$, will be adhered to.

Theorem 5.1: The 2-processor open shop non-preemptive mean flow scheduling problem is NP-complete.

Proof: Given a 3-PARTITION problem with $n$, $K$ and the set $\{a_1, \ldots, a_{3n}\}$, consider the following 2-processor open shop:

T-jobs: $T_0 = (0, g)$; $T_i = (t, g)$, $1 \le i \le n$;
X-jobs: $X_i = (x, 0)$, $1 \le i \le h$;
Y-jobs: $Y_i = (0, y)$, $1 \le i \le h$;
U-jobs: $V_{i,j} = (0, v)$, $1 \le i \le n$, $1 \le j \le (u-3)$, and $W_i = (0, v + a_i)$, $1 \le i \le 3n$;

where $u = 3nK + 3n + 1$, $g = 3nK + 3n + 6$, $v = n(K+1)g$, $t = uv + g + K$, $x = 2(n+h)t$, $y = x$, $s = (n+1)g + 3nK + nug + n(n-1)u(g+K)/2$, and $h = s + 1$.


The bound $D$ is given as $D = X_e + Y_e + T_e + U_e$, where

$X_e = \sum_{i=1}^{h} (nt + ix)$;
$Y_e = \sum_{i=1}^{h} (nt + g + iy)$;
$T_e = \sum_{i=0}^{n} (g + it)$; and
$U_e = 3nK + \sum_{i=0}^{n-1} \sum_{j=1}^{u} (g + it + jv)$.


Suppose the 3-PARTITION problem has a solution. Use the schedule suggested in Figure 22. Schedule, on the first processor, tasks $T_i[1]$, $1 \le i \le n$, followed by $X_i[1]$, $1 \le i \le h$, and on the second processor, tasks $T_i[2]$, $0 \le i \le n$, followed by $Y_i[2]$, $1 \le i \le h$. This yields the template of Figure 22. In each of the $n$ hatched areas on the second processor place $(u-3)$ V-type tasks followed by the three W-tasks corresponding to the three elements of one of the 3-element partitions. Clearly, the X, Y and T-jobs contribute $X_e$, $Y_e$ and $T_e$ to the mean flow time. Since there are exactly three W-type tasks with sum $(3v + K)$ in each hatched area, the contribution of the $i$-th hatched area, $0 \le i \le (n-1)$, is less than $3K + \sum_{j=1}^{u} (g + it + jv)$. (Note that the three W-tasks will be the last to be executed in each hatched area.) Hence the U-jobs' contribution to the mean flow time is less than $U_e$, and the bound $D$ is not exceeded by the mean flow time of the schedule.
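
The construction and the template schedule can be reproduced numerically for a small case. The following Python sketch is an illustration under stated assumptions: the function names and the 3-PARTITION instance (n = 2, K = 20) are made up, flow is summed as total finishing time as in the bound D, and the zero-length tasks are taken to finish at time 0.

```python
def build_parameters(n, K):
    """Parameters of the reduction, following the definitions in the proof."""
    u = 3 * n * K + 3 * n + 1
    g = 3 * n * K + 3 * n + 6
    v = n * (K + 1) * g
    t = u * v + g + K
    s = (n + 1) * g + 3 * n * K + n * u * g + n * (n - 1) * u * (g + K) // 2
    h = s + 1
    x = 2 * (n + h) * t
    y = x
    return dict(n=n, K=K, u=u, g=g, v=v, t=t, s=s, h=h, x=x, y=y)

def bound_D(P):
    """D = X_e + Y_e + T_e + U_e."""
    n, K, u, g, v, t, h, x, y = (P[k] for k in "n K u g v t h x y".split())
    X_e = sum(n * t + i * x for i in range(1, h + 1))
    Y_e = sum(n * t + g + i * y for i in range(1, h + 1))
    T_e = sum(g + i * t for i in range(n + 1))
    U_e = 3 * n * K + sum(g + i * t + j * v
                          for i in range(n) for j in range(1, u + 1))
    return X_e + Y_e + T_e + U_e

def template_flow(P, triples):
    """Total flow time of the template schedule for a given partition.

    triples[i] holds the three a-values placed in the i-th hatched area.
    """
    n, u, g, v, t, h, x, y = (P[k] for k in "n u g v t h x y".split())
    total = sum(n * t + i * x for i in range(1, h + 1))        # X-jobs
    total += sum(n * t + g + i * y for i in range(1, h + 1))   # Y-jobs
    total += sum(i * t + g for i in range(n + 1))              # T-jobs
    for i, triple in enumerate(triples):
        clock = i * t + g  # processor 2 is free once T_i[2] finishes
        for _ in range(u - 3):      # V-tasks first
            clock += v
            total += clock
        for a in triple:            # then the three W-tasks
            clock += v + a
            total += clock
        assert clock == (i + 1) * t  # the area is filled exactly
    return total

# Made-up instance: every a-value lies strictly between K/4 and K/2.
P = build_parameters(2, 20)
triples = [(6, 7, 7), (6, 6, 8)]
slack = bound_D(P) - template_flow(P, triples)
```

A positive `slack` confirms that this template schedule stays within the bound for the sample instance.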

Now, suppose there exists a schedule with mean flow time not exceeding the bound $D$. For brevity such a schedule will be referred to as a good schedule. The proof of the theorem is completed by showing that the desired 3-element partitions must exist. This, in turn, is done by showing that if there is a good schedule, then it can be reduced in polynomial time (i.e. polynomial in $nK$) to a good schedule with the structure of Figure 22. The following intermediate results (lemmas) are required for this purpose.

5.1 In a good schedule, $S(X_i[1]) \ge nt$ and $S(Y_i[2]) \ge (nt + g)$, $1 \le i \le h$. Hence, all T[1]-tasks are executed before any X[1]-tasks and all T[2] and U[2]-tasks are executed before any Y[2]-task. Without loss of generality, it may be assumed that the X[1], Y[2] and T-tasks are executed in increasing order of their index $i$.

5.2 In a good schedule, $S(X_1[1]) = nt$ and $S(Y_1[2]) = nt + g$. In addition, $F(T_i[1]) = ti$, $0 \le i \le n$.

5.3 In a good schedule, during the first $(nt + g)$ time units, at least $u$ tasks are executed on processor 2 in any time interval of length $(t - g)$.

5.4 A good schedule can be reduced in polynomial time to one in which $F(T_i[1]) \le S(T_i[2])$, $0 \le i \le n$, and $F(T_i[2]) \le F(T_{i+1}[1])$, $0 \le i < n$.

5.5 A good schedule can be reduced in polynomial time to one in which $S(T_i[2]) - F(T_i[1]) < v + K/2$, $0 \le i \le n$.

5.6 In a good schedule satisfying all previous lemmas, at most $(u + 1)$ and at least $(u - 1)$ U-tasks are executed between $T_i[2]$ and $T_{i+1}[2]$, $0 \le i < n$.

5.7 In a good schedule satisfying all previous lemmas, EXACTLY $u$ U-tasks are executed between $T_i[2]$ and $T_{i+1}[2]$, $0 \le i < n$.

5.8 In a good schedule satisfying all previous lemmas, $S(T_i[2]) = ti$, $0 \le i \le n$.

Lemmas 5.1 and 5.2 establish the fact that the T[1]-tasks are done in the first $nt$ time units and are immediately followed by the X[1]-tasks. In addition, the Y[2]-tasks are the final tasks to be done on processor 2 and they must start precisely at time $(nt + g)$, thus leaving no idle time on that processor.

The main conclusions of Lemmas 5.3 to 5.8 are given in 5.7 and 5.8 (the others are required merely as intermediate steps in the proofs of these). By Lemma 5.8, $S(T_i[2]) = ti$, $0 \le i \le n$. At this point, one obtains a good schedule which is similar in structure to that of Figure 22. Consider a good schedule which satisfies the lemmas. Then 3-element partitions, $H_i$, $1 \le i \le n$, are given by

$H_i = \{ a_k \mid S(T_{i-1}[2]) < S(W_k[2]) < S(T_i[2]) \}$.

By Lemma 5.7, there are $u$ U-tasks, $U_k[2]$, with $S(T_{i-1}[2]) < S(U_k[2]) < S(T_i[2])$. Since the V[2]-type tasks have length $v$, the W[2]-tasks have length less than $(v + K/2)$ and greater than $(v + K/4)$, and these $u$ tasks cover an interval of $t - g = uv + K$ without leaving an idle period, the $u$ tasks must contain exactly three W[2]-tasks with total length $3v + K$.

Hence, the open shop has a schedule with mean flow time not exceeding the deadline $D$ if and only if the 3-PARTITION problem has a solution.

The proofs of the lemmas follow. The effects of the lemmas are taken to be cumulative. In other words, when proving any lemma it is assumed that a good schedule which satisfies the previously proven lemmas is available.

Lemma 5.1: In a good schedule, $S(X_i[1]) \ge nt$ and $S(Y_i[2]) \ge (nt + g)$, $1 \le i \le h$. Hence, all T[1]-tasks are executed before any X[1]-task, and all T[2] and U[2]-tasks are executed before any Y[2]-task. Without loss of generality, one may assume that the X[1], Y[2] and T-tasks are executed in increasing order of their index $i$.


Proof: It can be assumed that all zero-length tasks are executed at time 0. First, suppose that only the first inequality is violated, i.e. $S(Y_i[2]) \ge (nt + g)$ for all $i$ but $S(X_j[1]) < nt$ for some $j$. Suppose further that only one X-task, $X_j[1]$, starts before time $nt$. Now, compute a lower bound for the mean flow of the schedule. The Y-jobs contribute at least $Y_e$. Since $x = 2(n + h)t$ is greater than $(nt + g)$ and no Y-task is executing on processor 2 before time $(nt + g)$, all other processor 2 tasks can be finished before $F(X_j[1])$. Hence, the remaining tasks on processor 1 after time $F(X_j[1])$ are completely independent of tasks remaining on processor 2 and are therefore scheduled optimally in non-decreasing order of their execution time. Hence, the X-jobs contribute at least

$X_b = x + \sum_{i=1}^{h-1} (x + nt + ix) = X_e - nt$.

Now, since $S(X_j[1]) < nt$, at least one of the tasks $T_i[1]$ must execute after task $X_j[1]$. Hence the T-jobs (consider only the T[1]-tasks) contribute at least

$T_b = x + \sum_{i=1}^{n} (ti) = T_e + x - (n + 1)g$.

Finally, the contribution of the U-jobs to total mean flow time may be bounded as follows. Since each U-task on the second processor is at least $v$ in length, a trivial lower bound for their flow is


$U_b = \sum_{i=1}^{nu} (iv)$
$= \sum_{i=0}^{n-1} \sum_{j=1}^{u} (iu + j)v$
$= U_e - 3nK + \sum_{i=0}^{n-1} \sum_{j=1}^{u} (iuv - it - g)$
$= U_e - 3nK + \sum_{i=0}^{n-1} \sum_{j=1}^{u} (-i(g + K) - g)$
$= U_e - 3nK - nug - n(n - 1)u(g + K)/2$
$= U_e - s + (n + 1)g$.

Adding these values together gives a lower bound of

$Y_e + X_b + T_b + U_b = Y_e + X_e + T_e + U_e + x - nt - s = D + x - nt - s$.

Since $x$ is strictly greater than $(nt + s)$, the schedule cannot be a good one.


Now suppose that only the second inequality in the lemma is violated. Assume $S(Y_i[2]) < nt + g$ for some $i$. By similar reasoning, one obtains a lower bound of $Y_b = (Y_e - nt - g)$ for the Y-jobs, $X_e$ for the X-jobs, $(T_e - (n+1)g)$ for the T-jobs and $(U_b + y - nuv)$ for the U-jobs, if a U[2]-task executes after $Y_i[2]$. Alternatively, one gets a lower bound of $X_e + Y_b + T_b + U_b$ if a T[2]-task executes after $Y_i[2]$. In either case, $y = x$ is large enough to ensure that the lower bound is strictly larger than $D$.

Finally, consider the mixed case where some $X_j[1]$ starts before $nt$ and some $Y_k[2]$ starts before $(nt + g)$. It follows that some T[1]-task will execute after $X_j[1]$ and either a T[2]-task or a U[2]-task will execute after $Y_k[2]$. If a U[2]-task follows $Y_k[2]$ then lower bounds of $X_b$, $Y_b$ and $(U_b + y - nuv)$ are obtained in the same manner as above. Again, the total lower bound is too large. The only remaining case is when a T[1]-task that follows $X_j[1]$ belongs to the same job as a T[2]-task that follows $Y_k[2]$. In this case one obtains a lower bound of

$X_b + Y_b + T_b + U_b = D + x - 2nt - s - g > D$.

Since the $X_i[1]$, $Y_i[2]$ have the same length it can be assumed that they are executed in order of increasing index $i$. Similarly, since the $T_i$-jobs for $1 \le i \le n$ have the same length tasks on each processor it can also be assumed that the order of execution on the first processor is in increasing index $i$. □

Lemma 5.2: In a good schedule, $S(X_1[1]) = nt$ and $S(Y_1[2]) = nt + g$. In addition, $F(T_i[1]) = ti$, $0 \le i \le n$.

Proof: Note that $X_1[1]$ and $Y_1[2]$ are now the first X and Y-tasks to be executed among the X and Y jobs on the first and second processors respectively.

By Lemma 5.1, $S(X_1[1]) \ge nt$ and $S(Y_1[2]) \ge nt + g$. Suppose $S(X_1[1]) > nt$. Then the flow for the X jobs is at least $\sum_{j=1}^{h} (nt + 1 + jx) = X_e + h$. The following trivial lower bounds for the other jobs are easily obtained: $Y_e$ for the Y-jobs, $(T_e - (n + 1)g)$ for the T-jobs (obtained by considering only T[1]-tasks), and $U_b$ for the U-jobs. Adding these together leads to a lower bound of $(D + h - s) = (D + 1)$ for the schedule, which contradicts the fact that it is a good schedule. The case $S(Y_1[2]) > nt + g$ is similar.

Thus, it follows directly that there is no idle period on the first processor for the first $nt$ time units and on the second processor for the first $(nt + g)$ time units. Thus, $F(T_i[1]) = ti$, $0 \le i \le n$. □

Lemma 5.3: In a good schedule, during the first $(nt + g)$ time units, at least $u$ tasks are executed on processor 2 in any time interval of length $(t - g)$.

Proof: Recall that there is no idle time on processor 2 during the first $(nt + g)$ time units. The number of tasks executed in the interval is at least $J = (t - g)/(v + K/2)$ since every task executed before $(nt + g)$ on processor 2 has length not exceeding $(v + K/2)$.

Now $v = n(K + 1)g > nKg > nKu > uK/2$
$\Rightarrow K > uK/2 - v - K/2$
$\Rightarrow uv + K > (u - 1)(v + K/2)$
or $t - g > (u - 1)(v + K/2)$
or $J > u - 1$. □
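
The final step relies on the task count being an integer: since $J > u - 1$, at least $u$ tasks are needed. As a hedged sanity check (not part of the original proof), the sketch below evaluates $J$ exactly for a range of made-up $(n, K)$ pairs using the parameter definitions of the reduction; `lemma_5_3_margin` is a hypothetical helper name.

```python
from fractions import Fraction

def lemma_5_3_margin(n, K):
    """Return J - (u - 1), where J = (t - g) / (v + K/2) as in the proof."""
    u = 3 * n * K + 3 * n + 1
    g = 3 * n * K + 3 * n + 6
    v = n * (K + 1) * g
    t = u * v + g + K
    J = Fraction(t - g) / (v + Fraction(K, 2))
    return J - (u - 1)

# Exact rational arithmetic avoids any rounding doubt near the boundary.
margins = [lemma_5_3_margin(n, K) for n in range(1, 6) for K in range(4, 40)]
```

Every margin lies strictly between 0 and 1, so $J$ falls between $u - 1$ and $u$ and the smallest admissible integer count is $u$.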

In an open shop, there is no restriction on the order in which a job's tasks are performed. It has been assumed that zero length tasks are executed at time 0. However, the T-jobs have non-zero tasks on both processors. The following lemma shows that there exists a good schedule which executes them in the same processor order, first on processor 1 and then on processor 2.

Lemma 5.4: A good schedule can be reduced in polynomial time to one in which $F(T_i[1]) \le S(T_i[2])$, $0 \le i \le n$, and $F(T_i[2]) \le F(T_{i+1}[1])$, $0 \le i \le n - 1$.

Proof: The proof is by construction. Recall that as a consequence of Lemmas 5.1 and 5.2 there is a good schedule in which $F(T_i[1]) = ti$, $0 \le i \le n$.

Consider first the T-jobs whose T[1]-tasks are executed before their T[2]-tasks. Scan the schedule from left to right looking for tasks $T_i[2]$ such that $S(T_i[2]) \ge F(T_i[1]) = ti$. It must be ensured that $F(T_i[2]) \le t(i+1)$. If this is not the case, perform the following operation as often as necessary.

Operation I:

Find the first U-task, $U_c$, on processor 2 preceding $T_i[2]$. Remove $U_c$ from the schedule, shift the tasks following $U_c$ up to and including $T_i[2]$ to the left to take up the interval vacated by $U_c$, and insert $U_c$ after $T_i[2]$.

It must be shown that it is possible to apply the operation until $F(T_i[2]) \le t(i+1)$ and that each application does not increase mean flow time. Since $F(T_i[2]) - F(T_i[1]) > t$ and the total length of the T[2]-tasks is $(n+1)g$, which is less than $t - v - K/2$, the existence of a U-task, $U_c$, with $S(U_c) \ge ti$ and $F(U_c) \le S(T_i[2])$ is guaranteed. Now, any $T_j[2]$-task between $U_c$ and $T_i[2]$ must have $F(T_j[2]) < S(T_j[1])$, so that shifting such a task to the left does not generate any conflicts with tasks on processor 1 and does not affect the flow time of job $T_j$. Now, the flow time of task $U_c$ increases by at most $(n+1)g$ while that of task $T_i[2]$ (and hence of job $T_i$) decreases by at least $v$. Since $v = n(K+1)g$ is much greater than $(n+1)g$, the total mean flow time does not increase.

In the second stage consider those T-jobs whose T[2]-tasks are done before the T[1]-tasks. This time scan the schedule from right to left and perform Operation II whenever a $T_i[2]$ with $F(T_i[2]) \le S(T_i[1])$ is encountered.

Operation II:

Find a V-task, $V_c$, such that $(ti + g) \le F(V_c) < (ti + v + g)$. Remove $T_i[2]$ from the schedule, shift the following tasks up to and including $V_c$ to the left to take up the interval vacated by $T_i[2]$, and insert $T_i[2]$ after $V_c$. If among the shifted tasks a $T_j[2]$ conflicts with $T_j[1]$, exchange the $T_j[2]$ with the following task.

Again, it is necessary to prove the existence of $V_c$ and show that Operation II does not increase mean flow time. Note that any $T_j[2]$ which executes after $T_i[2]$ satisfies the first part of the lemma and must also satisfy the second part (possibly, after an earlier application of II). Consequently, there is no task of length $g$ in the time interval $[ti, t(i+1))$ (see Figure 23) and at most one task, $T_j[2]$, of length $g$ in the interval $[t(i-1), ti)$. The task of length $g$ in $[t(i-1), ti)$ must be done before other U-tasks in the same interval. Therefore, there is no task of length $g$ in $[ti - v - K/2, t(i+1))$ (the lower limit of the interval over which there are no tasks of length $g$ can be decreased but the above is sufficient for the proof). Now the interval $[ti - v - K/2, t(i+1))$ has length $t + v + K/2$. Even if all the W-tasks were in this interval, their total length is less than $t - 3v$ and hence there is still room for at least four V-tasks in the interval. Since the V-tasks are smaller than any W-task, they will be the first to be done in the interval. Thus, within $[ti - v - K/2, ti + v + K/2)$ there are only V-tasks or parts of V-tasks, one of which will be identified as task $V_c$.

Now, consider the first task $V_j[2]$ to finish after time $ti$. By the above arguments this task and the one following it must be V-tasks. $V_j[2]$ finishes before $(ti + v + K/2)$ (which is less than $(ti + v + g)$). If $F(V_j[2]) - ti \ge g$, then choose $V_c = V_j[2]$, otherwise choose the task following it as $V_c$. Hence, $F(V_c) - ti < v + g$. In either case, $ti + g \le F(V_c) < ti + v + g$.

The operation proceeds to remove $T_i[2]$ from the schedule, shift the following tasks, say $J$ in number, up to and including $V_c$ to the left by a distance of $g$ and insert $T_i[2]$ after $V_c$. Since $F(V_c) \ge ti + g$, after performing II there can be no conflict between $T_i[2]$ and $T_i[1]$. However, it may be necessary to resolve some conflicts between some $T_j[2]$ and $T_j[1]$ where the $T_j[2]$ is among the tasks shifted left. This is done by exchanging the $T_j[2]$ with the task following it. (It is clear that the following task is a V-task by the same line of argument that was used to determine $V_c$.)

Now, since any intermediate T[2]-tasks already satisfy the second part of the lemma, in any interval $[t(j-1), tj)$, where $F(T_i[2]) \le t(j-1)$ before II was applied, there can be only one task $T_{j-1}[2]$ of length $g$. By Lemma 5.3, the same interval must contain at least $u$ U-tasks as well. Hence there are at most $\lfloor J/u \rfloor$ T[2]-tasks among the shifted tasks.

The effects of Operation II on the mean flow time of the schedule can now be computed. The flow time of job $T_i$ increases by at most $(v + g)$. The $J$ tasks shifted to the left lose at least $Jg$ flow time. In addition, the intermediate T[2]-tasks may gain at most $\lfloor J/u \rfloor v$ in exchanges to resolve conflicts. Thus, for the lemma to hold, it must be shown that $Jg \ge v + g + \lfloor J/u \rfloor v$, where $J \ge u$. (Since the $J$ tasks cover at least the interval $[t(i-1), ti]$, $J$ must be no less than $u$.) Let $J = cu + d$, $c \ge 1$ and $0 \le d < u$.

$Jg \ge v + g + \lfloor J/u \rfloor v$
$\Leftrightarrow (cu + d)g \ge v + g + cv$
$\Leftrightarrow 3ncK + 3nc + c + d \ge ncK + nc + nK + n + 1$, $c \ge 1$,

which is easily seen to be true. There are no other changes to the mean flow time. Hence, a good schedule can be reduced to one satisfying the lemma. □
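
The inequality that makes Operation II safe can also be spot-checked numerically. This is a minimal sketch under stated assumptions, not part of the original argument: `operation_ii_is_safe` and `check_range` are hypothetical helpers, and the $(n, K)$ pairs are arbitrary.

```python
def operation_ii_is_safe(n, K, J):
    """Check J*g >= v + g + floor(J/u)*v for the reduction's parameters."""
    u = 3 * n * K + 3 * n + 1
    g = 3 * n * K + 3 * n + 6
    v = n * (K + 1) * g
    return J * g >= v + g + (J // u) * v

def check_range(n, K, multiples=4):
    """Try every J from u up to (but excluding) multiples*u."""
    u = 3 * n * K + 3 * n + 1
    return all(operation_ii_is_safe(n, K, J) for J in range(u, multiples * u))

results = {(n, K): check_range(n, K) for n in (1, 2, 3) for K in (4, 8, 13)}
```

The condition $J \ge u$ matters: for a single shifted task ($J = 1$) the inequality fails, which is why the proof first establishes that at least $u$ tasks are shifted.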

Lemma 5.5: A good schedule can be reduced in polynomial time to one for which $S(T_i[2]) - F(T_i[1]) < v + K/2$, $0 \le i \le n$.

Proof: Consider a good schedule satisfying all previous lemmas. $S(T_i[2]) - F(T_i[1]) \le t - g$. Furthermore, in the interval $(ti, ti + t)$ there is no task of size $g$ on processor 2 other than $T_i[2]$. If $S(T_i[2]) - F(T_i[1]) > v + K/2$, then the task preceding $T_i[2]$, a U[2]-task, can be exchanged with $T_i[2]$ with no increase in mean flow time, and no conflict in the execution of job $T_i$ since that task must start after $ti = F(T_i[1])$. □

Lemma 5.6: In a good schedule satisfying all previous lemmas, at most $(u + 1)$ and at least $(u - 1)$ U-tasks are executed between $T_i[2]$ and $T_{i+1}[2]$, $0 \le i < n$.

Proof: The proof is similar to that of Lemma 5.3. By Lemma 5.5, the interval between $T_i[2]$ and $T_{i+1}[2]$ is no smaller than $(t - g - v - K/2)$ and no larger than $(t - g + v + K/2)$.

Suppose there are fewer than $(u - 1)$ U-tasks in the interval. Their total length is at most $nK + (u - 2)v$.

Now, $v = n(K + 1)g > nK - K/2$
$\Rightarrow uv + K/2 - v > nK + uv - 2v$
$\Rightarrow t - g - v - K/2 > nK + (u - 2)v$ (by definition of $t$).

Thus, there will be idle time on processor 2 and by Lemma 5.2 the schedule cannot be a good one.

Similarly, suppose there are more than $(u + 1)$ U-tasks. Their total length is at least $(u + 2)v$.

But $v > 3K/2$
$\Rightarrow uv + 2v > uv + v + 3K/2$
$\Rightarrow (u + 2)v > t - g + v + K/2$ (by definition of $t$).

Hence, at most $(u + 1)$ U-tasks can be present in the interval. □

This condition can be tightened further as follows.

Lemma 5.7: In a good schedule satisfying all previous lemmas, EXACTLY $u$ U-tasks are executed between $T_i[2]$ and $T_{i+1}[2]$, $0 \le i < n$.

Proof: To begin with, consider in what ways this lemma can be violated. If there are fewer than $iu$ U-tasks preceding $T_i[2]$ then the total length of tasks preceding $T_i[2]$ (including $T_j[2]$, $j < i$) on processor 2 is at most $(ig + (iu - 1)v + nK)$. This is less than the time interval, $ti$, which it has to cover, as

$ig + (iu - 1)v + nK < it = i(uv + g + K)$.

Thus, there are at least $iu$ U-tasks preceding $T_i[2]$. Now, let $k_i$, $0 \le i \le (n - 1)$, be the number of U-tasks in the interval between $T_i[2]$ and $T_{i+1}[2]$.


Suppose $k_p = k_q = u + 1$ and $k_i = u$ for $p < i < q$. Then, there are $((q + 1 - p)u + 2)$ U-tasks between $T_p[2]$ and $T_{q+1}[2]$ with total length at least $((q + 1 - p)u + 2)v$. Together with the tasks $T_i[2]$, $p \le i \le q$, this gives total length at least $((q + 1 - p)u + 2)v + (q + 1 - p)g$, which is greater than $(q + 1 - p)t + v + K/2$. Since the latter value expresses the maximum time interval available for the tasks, as implied by Lemma 5.5, this is not possible.

Collecting these facts together, namely, that $k_i = (u - 1)$, $u$ or $(u + 1)$, that $\sum_{j=0}^{i-1} k_j \ge iu$, and that it is not possible to have $k_p = k_q = (u + 1)$ and $k_i = u$ for $p < i < q$, one can conclude that the $(u + 1)$ and $(u - 1)$ values of $k_i$ occur alternately, starting with a $(u + 1)$ and finishing with a $(u - 1)$, with the $u$ values interspersed.

It is now shown that the bound $D$ is exceeded if $k_i$ is not equal to $u$ for $0 \le i \le (n - 1)$. A new lower bound $U_c$ for the flow of the U-jobs can now be computed as

$U_c = \sum_{i=0}^{n-1} \sum_{j=1}^{k_i} (g + jv + ti + E_i)$.

The term $E_i \ge 0$ will compensate for the fact that the $T_i[2]$ cannot start early enough after a $k_p = u + 1$ and before the following $k_q = u - 1$.

$U_c - U_e = -3nK + \sum_{i=0}^{n-1} \left( \sum_{j=1}^{k_i} (g + jv + ti + E_i) - \sum_{j=1}^{u} (g + jv + ti) \right)$.

With respect to the sequence $k_i$, outside of those subsequences starting with a $(u + 1)$ and ending with a $(u - 1)$, the $E_i$ term is zero and the corresponding summations in the above cancel out. However, it is shown below that if there is at least one such subsequence $(u + 1), u, \ldots, u, (u - 1)$, then $U_c + T_c - U_e - T_e > 0$, where $T_c$ is a corresponding bound for the T-jobs.

Let  k^,  k^  be  one  sucn  subsequence.  Task  T^[2], 

p<  i^q ,  cannot  start  until  after 

Pt  +  (i  -  p)g  +  ( ( i  -  p)u  +  1 ) v  =  ti  +  v  -  (i  -  p)K. 
Thus  =  0  and  Ei  =  v  -  (i  -  p)K,  p<i£q. 

In the expression for $(U_c - U_e)$ the summations corresponding
to the subsequence are

$$\sum_{j=1}^{u+1} (g + jv + pt) - \sum_{j=1}^{u} (g + jv + pt)
+ \sum_{i=p+1}^{q-1} \sum_{j=1}^{u} (v - (i - p)K)
+ \sum_{j=1}^{u-1} (g + qt + jv + v - (q - p)K) - \sum_{j=1}^{u} (g + qt + jv).$$

After simplification, this becomes

$$-(q - p)(g + uK) - \sum_{i=p+1}^{q-1} (u(i - p)K),$$

where the summation in the second term is zero if q = p + 1.
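The simplification can be spot-checked numerically. The sketch below is only an illustration, not part of the proof. It assumes the relation t = g + uv + K, which is what the identity pt + (i - p)g + ((i - p)u + 1)v = ti + v - (i - p)K implies, and it assumes the inner sums run over j = 1..u+1, 1..u and 1..u-1 for the k values (u + 1), u and (u - 1) respectively. The function names are hypothetical.

```python
def subsequence_terms(g, v, K, u, p, q):
    """Sum the terms of (U_c - U_e) belonging to one subsequence k_p..k_q.

    Assumes t = g + u*v + K and k_p = u + 1, k_q = u - 1, k_i = u in between.
    """
    t = g + u * v + K
    # i = p: u + 1 jobs in the new bound against u jobs in the old one, E_p = 0
    head = (sum(g + j * v + p * t for j in range(1, u + 2))
            - sum(g + j * v + p * t for j in range(1, u + 1)))
    # p < i < q: u jobs on both sides, only E_i = v - (i - p)K survives
    middle = sum(v - (i - p) * K for i in range(p + 1, q) for _ in range(u))
    # i = q: u - 1 jobs (each delayed by E_q) against u jobs
    tail = (sum(g + q * t + j * v + v - (q - p) * K for j in range(1, u))
            - sum(g + q * t + j * v for j in range(1, u + 1)))
    return head + middle + tail


def simplified(g, v, K, u, p, q):
    """Closed form stated in the text after simplification."""
    return -(q - p) * (g + u * K) - sum(u * (i - p) * K for i in range(p + 1, q))


if __name__ == "__main__":
    # Arbitrary illustrative values; they are not taken from the reduction.
    print(subsequence_terms(3, 50, 2, 4, 1, 3), simplified(3, 50, 2, 4, 1, 3))
```

For the illustrative values above both functions return -30, and with q = p + 1 the second sum is empty, as the text notes.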

However, on computing the lower bound $T_c$ for the T-jobs,
again the values to be summed outside the subsequences are
the same as in the summation for $T_e$. For the subsequence
above, one obtains

$$\sum_{i=p+1}^{q} (g + ti + v - (i - p)K) - \sum_{i=p+1}^{q} (g + ti)
= v - (q - p)K + \sum_{i=p+1}^{q-1} (v - (i - p)K)$$

as the difference in the two summations. Since
$v - (q - p)K > (q - p)(g + uK) + 3nK$ and
$v - (i - p)K > u(i - p)K$, $p < i < q$, the overall bound is
strictly greater than D. Note that the $3nK$ term in the
definition of $U_e$ has been taken care of, while the bounds




for X and Y-tasks remain $X_e$ and $Y_e$.

Thus if there is at least one subsequence of the type
described, the flow time is strictly greater than D. Hence,
$k_i = u$ for $0 \le i \le (n - 1)$. □

Lemma 5.8: In a good schedule satisfying the previous
lemmas, $S(T_i[2]) = ti$, $0 \le i \le n$.

Proof: Given a good schedule which satisfies the
previous lemmas, assume that $S(T_i[2]) > ti$ for some i,
$0 \le i \le n$. Then the following lower bounds for the jobs' flow
time are easily determined: $X_e$ for the X-jobs, $Y_e$ for the
Y-jobs, $(T_e + 1)$ for the T-jobs, and finally at least
$(U_e + u - 3nK)$ for the U-jobs (the u term is introduced by
the delay on the u U-jobs between $T_i[2]$ and $T_{i+1}[2]$). Adding
these together gives a bound of $(D + u + 1 - 3nK)$, which is
greater than D. Hence, the lemma must be true. □

This concludes the proof of the theorem. □

5.3 Heuristic Solutions

The previous section showed that the 2-processor n-job mean
flow time scheduling problem for the open shop is NP-complete.
In this section, tight bounds are obtained on the mean flow
time of an arbitrary schedule, and of a schedule obtained using
the shortest processing time (SPT) first heuristic, as compared
to the mean flow time of an optimal schedule.
Recall that the SPT heuristic is optimal for the 1-processor
case.

Let the n jobs be $J_i$, $1 \le i \le n$. The j-th task of the
i-th job is $J_i[j]$ and has execution time $t_i[j]$. Let
$T_i = \sum_{j=1}^{m} (t_i[j])$ and $T = \sum_{i=1}^{n} (T_i)$. $T_i$ is the total
processing time for the i-th job, while T is the total time
for all the n jobs. For any schedule Z the mean flow time of
Z will be denoted by mft(Z). The notation for starting and
finishing time of a task (or job), a, will be as usual S(a),
F(a). (If Z is subscripted, S and F may be given the same
subscript to indicate clearly that starting and finishing
times are given with respect to the particular schedule Z.)

Theorem 5.2: Let $Z_o$ be an optimal mean flow schedule
for an m-processor open shop with n jobs. Let Z be an
arbitrary schedule. Then,

$$\frac{\mathrm{mft}(Z)}{\mathrm{mft}(Z_o)} \le n.$$

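Proof: Without loss of generality, assume that the jobs
are completed in the order $J_1, \ldots, J_n$. In the worst case,
no task of job $J_i$ is started before $J_{i-1}$ has been completed.
Hence, $F(J_i) \le \sum_{k=1}^{i} (T_k)$.

Now,

$$\mathrm{mft}(Z) = \sum_{i=1}^{n} (F(J_i)) \le \sum_{i=1}^{n} \Bigl( \sum_{k=1}^{i} (T_k) \Bigr) = \sum_{i=1}^{n} (n + 1 - i)T_i \le nT.$$

$$\mathrm{mft}(Z_o) = \sum_{i=1}^{n} (F_o(J_i)) \ge \sum_{i=1}^{n} (T_i) = T \ge \mathrm{mft}(Z)/n. \quad □$$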
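The counting step in the proof can be illustrated with a short sketch. It models only the worst case described above, where the jobs run strictly one after another, and checks that the sum of finishing times equals the weighted sum with coefficients (n + 1 - i) and stays below nT. The function name and the sample data are illustrative assumptions.

```python
from itertools import accumulate


def serial_flow_bounds(job_totals):
    """Return (sum of finishing times, weighted form, n*T) for a serial schedule.

    job_totals[i] is T_i, the total processing time of the (i+1)-th job to
    complete; job i+1 is assumed not to start before job i has finished.
    """
    n = len(job_totals)
    finish = list(accumulate(job_totals))      # F(J_i) = T_1 + ... + T_i
    flow = sum(finish)                         # sum of finishing times
    # Index i is 0-based here, so the 1-based weight (n + 1 - i) becomes (n - i).
    weighted = sum((n - i) * t for i, t in enumerate(job_totals))
    return flow, weighted, n * sum(job_totals)


if __name__ == "__main__":
    flow, weighted, upper = serial_flow_bounds([2, 2, 6])
    print(flow, weighted, upper)
```

Since every job needs at least its own total processing time to finish, T is a lower bound on the same quantity for the optimal schedule, which gives the factor n.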

The  bound  given  above  is  asymptotically  tight  as  illustrated 
by  the  following  example. 




Example 5.1: m = 2. The jobs are $J_1, J_2, \ldots, J_{n+1}$,
where $t_i[1] = t_i[2] = 1$, $1 \le i \le n$;
$t_{n+1}[1] = x$; $t_{n+1}[2] = 1$; $x > n$.

The schedules Z and $Z_o$ are given in Figure 24. In the
figure, tasks for the i-th job are indicated by integer i.
(In general, an optimal mean flow schedule is not a minimal
length schedule. This explains the fact that schedule $Z_o$ in
Figure 24 is actually longer than schedule Z.)


$$\mathrm{mft}(Z) = (x + 1) + \sum_{i=1}^{n} (x + i) = (n + 1)x + n(n + 1)/2 + 1.$$

$$\mathrm{mft}(Z_o) = \sum_{i=1}^{n-1} (i + 1) + n + x + (n + 1) = x + n(n + 5)/2.$$

Therefore,

$$\frac{\mathrm{mft}(Z)}{\mathrm{mft}(Z_o)} = \frac{(n + 1)x + n(n + 1)/2 + 1}{x + n(n + 5)/2},$$

which approaches n + 1 as x approaches infinity.
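A small numerical sketch of Example 5.1 follows. It evaluates the two sums exactly as written, with the first sum for the optimal schedule assumed to run over i = 1..n-1, compares them with the closed forms, and shows the ratio drifting toward n + 1 as x grows. The function names are illustrative.

```python
from fractions import Fraction


def flow_arbitrary(n, x):
    """Sum of finishing times for schedule Z of Example 5.1."""
    return (x + 1) + sum(x + i for i in range(1, n + 1))


def flow_optimal(n, x):
    """Sum of finishing times for schedule Z_o of Example 5.1."""
    return sum(i + 1 for i in range(1, n)) + n + x + (n + 1)


def ratio(n, x):
    """Exact ratio of the two flow sums as a rational number."""
    return Fraction(flow_arbitrary(n, x), flow_optimal(n, x))


if __name__ == "__main__":
    n = 4
    for x in (10, 1_000, 1_000_000):
        # Closed forms from the text; n(n+1) and n(n+5) are always even.
        assert flow_arbitrary(n, x) == (n + 1) * x + n * (n + 1) // 2 + 1
        assert flow_optimal(n, x) == x + n * (n + 5) // 2
        print(x, float(ratio(n, x)))
```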


Now consider the case for the SPT heuristic, in which
jobs are processed in order of non-decreasing processing
time. The rule is normally implemented as follows: Suppose
that the j-th processor is available, and jobs $J_i$ and $J_k$
have no tasks currently under execution and their tasks for
the j-th processor have not yet been executed; then $J_i[j]$ is
chosen to execute before $J_k[j]$ if $T_i \le T_k$.
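The rule can be sketched as a small event-driven simulation. This is one possible reading of the rule, not the implementation used for the results here. It assumes that ties in $T_i$ are broken by job index, that processors are examined in index order at each decision point, and that zero-length tasks are simply skipped. The function name and the sample instance are hypothetical.

```python
def spt_open_shop(times):
    """Simulate the SPT rule on an open shop and return each job's finishing time.

    times[i][j] is the execution time of job i on processor j.
    """
    n, m = len(times), len(times[0])
    total = [sum(row) for row in times]                      # T_i
    order = sorted(range(n), key=lambda i: (total[i], i))    # SPT priority order
    pending = [{j for j in range(m) if times[i][j] > 0} for i in range(n)]
    job_free = [0] * n      # time at which job i has no task under execution
    proc_free = [0] * m     # time at which processor j becomes available
    finish = [0] * n
    now = 0
    while any(pending):
        for j in range(m):
            if proc_free[j] > now:
                continue                                     # processor busy
            for i in order:
                # Job must be idle and still owe a task to processor j.
                if j in pending[i] and job_free[i] <= now:
                    end = now + times[i][j]
                    proc_free[j] = job_free[i] = finish[i] = end
                    pending[i].remove(j)
                    break
        later = [t for t in proc_free + job_free if t > now]
        if not later:
            break
        now = min(later)                                     # next decision point
    return finish


if __name__ == "__main__":
    # Two unit jobs and one long job on two processors (illustrative data).
    print(spt_open_shop([[1, 1], [1, 1], [5, 1]]))
```

On this instance the two short jobs swap processors and both finish at time 2, after which the long job runs its two tasks in sequence.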

Theorem 5.3: Let $Z_o$ be an optimal mean flow time
schedule for an m-processor n-job flow shop. Let $Z_s$ be a
schedule constructed with the SPT heuristic.
Then,

$$\frac{\mathrm{mft}(Z_s)}{\mathrm{mft}(Z_o)} \le m.$$

Proof: Assume $T_1 \le T_2 \le \cdots \le T_n$, and write

$$\mathrm{mft}(Z_s) = \sum_{i=1}^{n} (F_s(J_i)).$$

Let (q(1), q(2), ..., q(n)), a permutation of the first n
integers, be the order in which the jobs are completed in
schedule $Z_o$. Then,

$$F_o(J_{q(i)}) \ge \sum_{k=1}^{i} (T_{q(k)}/m).$$

Therefore,

$$\mathrm{mft}(Z_o) = \sum_{i=1}^{n} (F_o(J_{q(i)}))
\ge \sum_{i=1}^{n} \Bigl( \sum_{k=1}^{i} (T_{q(k)}/m) \Bigr)
\ge \sum_{i=1}^{n} \Bigl( \sum_{k=1}^{i} (T_k/m) \Bigr)
\ge \mathrm{mft}(Z_s)/m. \quad □$$

As for the previous theorem, the bound is
asymptotically tight as illustrated by Example 5.2.


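Example 5.2: There are m + 1 jobs, $J_1, \ldots, J_{m+1}$, for m
processors, where

$$t_1[j] = \begin{cases} 1 & \text{for } j = 1, \\ 0 & \text{for } 2 \le j \le m; \end{cases}$$

$$t_i[j] = \begin{cases} 2 & \text{for } j = 1 \text{ or } j = i, \\ 0 & \text{for } 2 \le j \le m,\ j \ne i, \end{cases} \qquad 2 \le i \le m;$$

$$t_{m+1}[j] = \begin{cases} x & \text{for } j = 1 \ (x > 2), \\ 2 & \text{for } j = 2, \\ 0 & \text{for } 3 \le j \le m. \end{cases}$$

The SPT and optimal schedules, $Z_s$, $Z_o$, are given in
Figure 25. As in Example 5.1, tasks for the i-th job are
indicated in the figure by integer i.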
FIGURE 25: Schedules for Example 5.2
$$\mathrm{mft}(Z_s) = 1 + (x + 3) + \sum_{i=1}^{m-1} (x + 1 + 2i) = mx + m^2 + 3.$$

$$\mathrm{mft}(Z_o) = 1 + (2m + 2 + x) + \sum_{i=1}^{m-1} 2(i + 1) = x + m(m + 3) + 1.$$

Therefore,

$$\frac{\mathrm{mft}(Z_s)}{\mathrm{mft}(Z_o)} = \frac{mx + m^2 + 3}{x + m(m + 3) + 1},$$

which approaches m as x approaches infinity.
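The arithmetic of Example 5.2 can be checked in the same spirit. The sketch below assumes both sums run over i = 1..m-1 and that the closed form for the SPT schedule is mx + m^2 + 3, which is what the summation yields. The function names are illustrative.

```python
from fractions import Fraction


def flow_spt(m, x):
    """Sum of finishing times for the SPT schedule Z_s of Example 5.2."""
    return 1 + (x + 3) + sum(x + 1 + 2 * i for i in range(1, m))


def flow_reference(m, x):
    """Sum of finishing times for the schedule Z_o of Example 5.2."""
    return 1 + (2 * m + 2 + x) + sum(2 * (i + 1) for i in range(1, m))


def spt_ratio(m, x):
    """Exact ratio of the two flow sums as a rational number."""
    return Fraction(flow_spt(m, x), flow_reference(m, x))


if __name__ == "__main__":
    m = 3
    for x in (10, 1_000, 1_000_000):
        assert flow_spt(m, x) == m * x + m * m + 3
        assert flow_reference(m, x) == x + m * (m + 3) + 1
        print(x, float(spt_ratio(m, x)))
```

For m = 3 the printed ratio rises toward 3 as x grows, in line with the limit stated above.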


5.4  Discussion 

The main contribution of this chapter is the reduction
from 3-PARTITION to the 2-processor open shop mean flow time
scheduling problem, thus showing the problem to be
NP-complete. One may conclude as well that the problem
remains NP-complete when the number of processors m > 2.
Thus the relaxation of the constraint that each job's tasks
be processed in the same processor order in the flow shop
model, yielding an open shop, still leaves an intrinsically
difficult problem.

In addition to the above, tight bounds have been
derived for the mean flow time of an arbitrary schedule and
for an SPT schedule in terms of the optimal mean flow time.
Since the number of jobs is usually much larger than the
number of processors, the bounds indicate some advantage of
SPT schedules over arbitrary schedules.




Chapter  Six 


CONCLUDING  REMARKS 

Deterministic processor scheduling is of practical and
theoretical value to computer science as well as many other
disciplines, and as such its study is a worthwhile endeavour.
In this thesis, several models of scheduling theory have
been considered, relevant previous work surveyed, and some
significant results obtained.

For some minimal length problems, polynomial algorithms
have been developed; namely, an $O(n^6)$ algorithm for LMS and
SML three-processor flow shops and an $O(n^2)$ algorithm for an
m-processor bound UET system with two task chains. An $O(n^2 I)$
algorithm is also given for a tree-structured set of tasks
on a 2-processor bound UET system, where I is the number of
terminal subsets of the tree. The latter algorithm can be
extended, with corresponding increase in processing time, to
more than two processors. This algorithm is not polynomial
in n but is a significant improvement over the alternative
of simple enumeration. Furthermore, the dynamic programming
technique outlined in the algorithm can be applied with any
terminal subset enumerator to provide solutions for similar
systems with more complex precedence constraints.

Several problems were also shown to be NP-complete.
These are minimizing schedule length on two-processor bound
UET systems even when the precedence constraints consist of
chains only, 2-maximal three-processor flow shops and 1 or 3
maximal/minimal flow shops, and minimizing the mean flow
time on the two-processor open shop. In every case, the
result trivially holds when the number of processors is
increased. With the exception of the 2-maximal flow shop,
the results are strong NP-complete reductions from the
3-PARTITION problem.

Finally, in the area of performance bounds, tight
bounds were obtained on the lengths of list schedules on
identical processors for independent tasks with similar
execution times, and on the mean flow times of arbitrary and
SPT schedules for the open shop.


There are still many challenging open problems in this
field. The distinction between problems which have
polynomial solutions and the NP-complete problems is a
useful one, and there are problems for which this
classification is yet to be accomplished. These include
minimizing the schedule length for equal execution time
tasks for a fixed number of processors $m \ge 3$, and minimizing
the mean flow time for m processors, equal execution time
tasks and equal weights, where the precedence relations
constitute a tree or forest.


There is also much to be done in the design and
analysis of heuristic algorithms for the hard, NP-complete,
problems. The fairly recent technique of approximation
algorithms (Sahni, 1975) is virtually untapped and may lead
to exciting new solutions.

Although the work presented here has concentrated on
the more common models, there are others more suitable and
realistic for some applications. These include some types of
processor bound systems such as the job shop (Baker, 1974)
and typed systems (Liu and Liu, 1977; Jaffe, 1978). These
models, as well as those considered in this thesis, may also
be studied in connection with other performance measures,
such as minimizing lateness and tardiness, and minimizing
the mean number of tasks in the system at any time.